Throughout history those concerned with the control of human
behavior (parents and educators, businessmen and lawmakers) have
acted on the belief that rewards and punishments are powerful tools 
for the selection and fixation of desirable acts and the elimination of 
undesirable ones. This commonsense view of the nature of learning 
has not received undivided support from the professional students of 
behavior. Almost forty years after the first enunciation of the law of 
effect as a formal doctrine of learning, the problem of reinforcement is 
still the subject of heated controversy among the proponents of rival 
theories of learning. The question has been asked time and again 
whether this conflict is irreconcilable (47, 142). Perhaps there is some 
truth in all, or at least some, of the conflicting views, and the main source 
of difficulty may be the proneness of many theorists to attempt an 
explanation of all learning in terms of their favorite hypotheses. After 
years of debate, the possibility of a one-principle theory of learning,
whether this principle be contiguity, substitution, reward, or expectancy,
is still doubtful. This doubt is greatly strengthened by a survey
of the history of the law of effect and an evaluation of its present status. 

Historical Sources 2

Even though the law of effect is invariably associated with the name 
of Thorndike, who first used this phrase (264), the principles embodied 
in the law have a long history antedating by many years the modern 
laboratory studies of learning. The formulation of the law reflected 

1 The writer wishes to express his great indebtedness to Professor Gordon W. Allport,
who first suggested this review. His unceasing interest and generous advice have been
invaluable. 

2 The law of effect was previously reviewed in this journal by Waters (312). The
writer has greatly benefited from this discussion. For the sake of a unitary presentation 
some of the problems raised by Dr. Waters are discussed again in this paper. 
the impact on psychology of three major trends in the history of
thought: associationism, hedonism, and evolutionary theory. 

Associationism. Ever since Aristotle first wrote of the laws of associa- 
tion of thoughts (similarity, contrast, and contiguity), many philoso- 
phers and psychologists have treated the problems of learning, memory 
and coherent thinking in terms of associations formed among mental 
elements. Hobbes, Locke, Hartley, Hume, Mill, Wundt, and many 
more subscribed to the doctrine of association of ideas. Some pinned 
their faith on the principle of contiguity alone; others introduced simi- 
larity as a basic law; some writers, such as Hartley, suggested physio- 
logical foundations for the streams of associations. But in spite of diver- 
gencies about this or that aspect of the doctrine, they all agreed that 
mental organization resulted from the joining and weaving together 
of elements — for a long time the term idea predominated — into the 
content of consciousness. Associationism is, almost by definition, ele- 
mentaristic and connectionistic. It is elementaristic because it needs 
conceptual elements which get associated. It is connectionistic because 
it deals with the principles of connection among these elements. These 
two basic characteristics of associationistic thought are made explicit 
in the law of effect. 

Hedonism. Associationism is concerned with the laws of connection 
of mental elements: it has little to say about the role of motives in 
the acquisition of learned responses. Yet the study of learning cannot 
proceed long before coming face-to-face with the problem of motivation. 
In its approach to motivation, modern learning theory has deep roots 
in the philosophy of hedonism, and the development of the psychology 
of learning has been characterized by a stubborn defense of hedonistic 
principles on the one hand and a struggle for the emancipation from 
hedonism on the other. To the hedonist, pleasure and pain are the 
governing principles of behavior. The search for pleasure and the avoid- 
ance of pain are the mainsprings of conduct, individual and social, and 
the basis of social interaction and organization. The idea is very old
indeed. Plato saw in pleasure and pain important motives of human
action; Aristotle called them the basis of the will. Hobbes’ interpretation
of social life revolved around man’s seeking of pleasure and avoidance
of pain. The pleasure-pain principle finally found its most systematic
expression in the doctrine of utilitarianism (Bentham) which
regarded self-interest as a sufficient principle to account for most of 
individual and social action. The pleasure-pain principle was still very 
much alive when the law of effect was formulated. 

Evolution. The doctrine of evolution served to bring associationism
and hedonism more closely together. The problem of adaptive behavior
became central. Some responses to environmental stimuli are superior 
to others and lead to the survival of the fittest. The question at once 
arises how these superior adaptive responses are selected from the mul- 
tiplicity of responses of which an organism is capable, and then fixated 
and perpetuated. To those who tried to answer this question, hedonism 
and the pleasure-pain principle provided the principle of selection, 
and the laws of association the mechanism of fixation. 

Early Formulations of a Law of Effect 

The convergence of associationism, hedonism, and evolutionary doctrine
is exemplified in the formulation of what is in substance a law of
effect by Spencer (250) and by Bain (8). Their theories have been 
reviewed and analyzed in detail by Cason (33) but since they embody 
salient features of the effect doctrine they will be briefly summarized 
here. Spencer's (250) basic assumption is that in the course of natural 
selection there has been established in the various species a correlation 
between the pleasant and the beneficial, and a similar correlation be- 
tween the unpleasant and the injurious. 3 That which is pleasant is main- 
tained and repeated and proves beneficial to the biological organism. 
That which is painful is abandoned and the organism is protected from 
injury. Spencer then described a physiological mechanism for the se- 
lection of useful pleasant acts and the elimination of harmful unpleasant 
acts. Organisms respond to environmental stimuli in a highly variable, 
essentially random fashion with diffuse discharges of neural energy. 
Among these random responses there will be sooner or later one (ac- 
cidental) response which is successful (for example, produces food for the 
animal). After success will immediately come pleasurable sensations 
with a large discharge of nervous energy toward the organs engaged 
in the successful act, e.g., in eating. This heightened discharge of ner- 
vous energy will render the successful channels of muscular action more 
permeable. On recurrence of the circumstances the neural discharge 
will no longer be diffuse but will be channeled into the successful move- 
ments. With every repetition the successful channels will be made more 
and more permeable, until stable nervous connections have been or- 
ganized. 

3 This assumption is echoed in James' Principles where we read: "If pleasure and pains 
have no efficacy one does not see why the most noxious acts, such as burning might not 
give thrills of delight, and the most necessary ones, such as breathing cause agony" 
(132, p. 143). Another exponent of a pleasure-pain theory of learning was J. Mark Baldwin
(9). See also Pyle (215).
Bain (8) formulated a similar theory: spontaneously begun movements
which are accidentally successful cause pleasure, and with the
pleasure there is an increase in vital energy. A few repetitions of this 
coincidence of pleasure and movement lead to a neural connection so 
that pleasure or the idea of the pleasure will evoke the successful move- 
ment at once. In the case of pain, the sequence is the reverse, leading to 
a decrease of vital activity and a blocking of movement. 

The theories of Spencer and Bain have been summarized in some de- 
tail here because certain salient features of the pleasure-pain theory have 
proved extremely persistent in later developments: 

1. The organism is said to respond to the problem situation by random 
movements and spontaneous discharges of the muscles. Later analyses of trial- 
and-error responses have been attacked because they seemed to imply that an 
animal's initial responses in a problem situation are random (148). This would 
be an extremely hazardous assumption to make, for in a problem situation an 
animal will rarely respond completely at random; even his trial-and-error re- 
sponses always have a certain directionality though they may at first appear 
random from the observer’s viewpoint. Thoughtful analysts of trial-and-error 
learning have recognized this fact and pointed out that the assumption of 
randomness is not necessary for a trial-and-error theory of learning (240).

2. The causal efficacy of pleasure and pain in fixating successful responses 
and eliminating unsuccessful ones is explicitly asserted. Pleasure and pain are 
purely psychic concepts and mechanistically inclined thinkers have been re- 
luctant to endow them with such power over muscular responses (29, 33, 112). 
Hence the need to invoke physiological correlates of pleasure and pain to reduce 
their action to mechanistic principles. This need was felt not only by Spencer 
and Bain but also by later writers continuing the tradition of a pleasure-pain 
theory of learning (252, 264, 297). Most of these physiological explanations 
have been highly speculative, as are certainly those advanced by Spencer and 
Bain. There is, for example, no evidence to support the contention that pleasure 
is accompanied by heightened neural activity and pain by lessened activity. 
This may sometimes be the case but the opposite may be equally true, espe- 
cially in the case of intense pain. The trouble was that a pleasure-pain theory was 
really a theory of psychic causes leading to physical effects. Physiological ex- 
planations have been largely ad hoc and invoked for the sake of consistency 
with basic axioms about the mechanistic nature of the learning process. 4 

3. The third feature which the pleasure-pain view bequeathed to modern 
learning theory is its insistence that a repetition of a successful response will 

4 The vexatious issue of hedonism was sidestepped by Hobhouse (110) who described 
effect in terms of confirmation and inhibition. Similarly Holmes (114) speculated that 
the congruity or incongruity of an act with the activity in progress constituted its effect. 
Such formulations, however, only postpone the consideration of the role played by satis- 
faction and annoyance. As Stephens (257, 258) has pointed out, if an organism unex- 
pectedly stumbles on a valuable outcome, the failure of the response to confirm the 
expectancy will not prevent the response from recurring. Similarly, a congruous ex- 
pected response may be eliminated if it leads to unpleasant consequences. 
strengthen it. There are many cases where this is true but probably equally
many where the opposite holds, where continued repetition of a
response will weaken that response (59).

4. Let us finally note the central emphasis which a pleasure-pain theory 
puts on the motor (movement) aspect of the connection between situation and 
response. It is the movement which is strengthened by pleasure and increased 
neural activity. Later formulations of the law of effect, especially in conjunction 
with conditioning principles (127), have placed equal emphasis on the strengthening
of afferent-efferent connections by reinforcement. Critics have insisted
that movements as such are rarely learned and used by animals only as means to 
an end. It is probably safe to say that changes in the stimulus as the animal 
perceives it and interprets it (3, 29, 30) are just as essential as the fixation of suc- 
cessful movements. 

The New Associationism: Thorndike’s Laws of Learning 

The combination of a modified pleasure-pain (success-failure) 
philosophy and connectionism found its most systematic and challeng- 
ing expression in the work of Thorndike. Even though he later dis- 
claimed hedonism (273), Thorndike’s thinking has reflected that of the 
earlier exponents of the pleasure-pain principle as well as the associa- 
tionist doctrine. His central importance derives from the fact that he 
embarked on a monumental series of experiments, extending over half 
a century, to obtain empirical verification of his laws of learning. Partly 
on the basis of his study of the problem-solving behavior of animals, 
Thorndike formulated two basic laws of learning: the law of exercise 
and the law of effect. 

Other things being equal, the law of exercise makes learning a function of 
the number of repetitions of the stimulus-response connections. 

The law of effect was stated as follows: “Of several responses made to the 
same situation, those which are accompanied or closely followed by satisfaction 
to the animal will, other things being equal, be more firmly connected with the 
situation, so that when it recurs, they will be more likely to recur; those which 
are accompanied or closely followed by discomfort to the animal will, other 
things being equal, have their connection with the situation weakened so that, 
when it recurs, they will be less likely to occur. The greater the satisfaction or 
discomfort the greater the strengthening or weakening of the bond” (264, p.
244). More succinctly expressed, the law as originally stated asserts that success
stamps in and failure stamps out.

In the light of careful experimental analysis Thorndike later aban- 
doned the law of exercise, or at least he relegated this law to a very 
minor position, and for him the law of effect became altogether central 
to the learning process (267, 269). At various times Thorndike’s state- 
ments of the law of effect have been accompanied by an exposition of 
its probable physiological correlates. The physiological locus of reinforcement
by effect is put squarely in the neurons (e.g., 264, 267, 269,
270, 273). Satisfying and annoying processes are “related to the main- 
tenance and hindrance of the life processes of the neurons” in whose 
changing synaptic conduction learning consists (264, p. 24). In other 
words, pleasure or satisfaction facilitates synaptic conduction, renders 
synapses more permeable, whereas discomfort or pain blocks synaptic 
conduction. 

The subsequent history of the law of effect has shown two major 
foci of interest. 

1. The first major focus of interest has been Thorndike’s law itself. Ever 
since its first formulation it has been subjected to a long series of attacks, at- 
tacks which were vigorously countered by Thorndike and his associates both on 
the level of theoretical discussion and in the experimental laboratory. In the 
course of these discussions Thorndike himself has shifted his theoretical position 
to some extent and reformulated the law. Concurrently many of the important 
parameters of reinforcement (such as amount of reward, frequency of reward, 
delayed reward, spread of effect, etc.) were investigated. 

2. A second focus of interest has developed around the integration of the 
law of effect with the facts and theories of conditioning. In Hull’s systematic 
theory of learning (127), the law of effect occupies a central position. On the
other hand, the law of effect has been singled out for attack by those who are 
opposed to a conditioning-reinforcement type of learning theory (e.g., Tolman 
and his associates: 289, 294, 315). 

The law of effect has thus become an issue around which systematic 
differences in behavior theory have been crystallized. Defense and 
condemnation of the law of effect have almost become symbolic of 
widely divergent approaches to the problems of behavior analysis and 
learning theory. The discussion which follows will be centered around 
these two foci of interest and will also attempt to cover a representative 
selection of studies dealing with the parameters of reinforcement. 

The Classical Objections to Retroaction 

Let us turn first to a consideration of the classical objections to the 
law of effect. In the language of the law of effect, the consequence of a 
response strengthens the response, i.e., the effect “works back” on the 
connection which it follows. Such a formulation involves the assumption 
of retroaction. Critics asked, how can an effect work upon a response 
which is already passed (29, 33, 46, 212, 219)? The logical difficulty is 
obvious, and Thorndike has not been inclined to minimize it. He has 
suggested that the physiological equivalent of the connection does not 
vanish instantaneously but is still present when the consequence (reward
or punishment, knowledge of results) occurs (269, p. 481). The
original statement concerning backward action should then be modified 
to read that one type of after-effect (satisfier or annoyer) acts on other
after-effects of a connection (the persisting physiological equivalent of 
the original connection). 

The same logic for bridging the temporal gap between connection 
and consequence and avoiding the cul-de-sac of retroaction underlies 
Hull’s use of the concept of stimulus-trace. Upon cessation of a stimulus 
(exteroceptive or proprioceptive) the afferent impulse continues its 
activity for a finite short period, and a short-range temporal integration 
is achieved. Hull writes, “This perseverative stimulus-trace is biologi- 
cally important because it brings the effector organ en rapport not only 
with environmental events which are occurring at the time but with 
events which occurred in the recent past, a matter frequently critical 
for survival” (127, p. 385). Not only may the physiological correlate of a 
stimulus or response persevere in time but symbolic processes may help 
the human learner and, to a lesser extent, the animal learner to bridge 
the temporal gap between the occurrence of the connection and the 
action of the satisfier (192). 

There have been other attempts to slip from under the horns of the 
dilemma of retroaction. Accepting the apparent fact of backward action 
at its face value, retroflex circuits in the brain (297), neural irradiation 
(96), changes in the electrical resistance of neural connections (251) and 
their readiness to conduct (252, 267), the rearousal of just-active pathways
(44): all these were suggested as possible mechanisms mediating
the action of satisfiers and annoyers. These physiological explanations 
can be called neither right nor wrong. They are clearly speculative and 
many steps removed from the level at which experimental verification 
is at present possible. 

The most radical solution of the problem has been to deny retroac- 
tion altogether and to affirm that in spite of the appearances to the 
contrary effect always operates in a forward direction. What is modified 
is not the connection which is passed and done with but the stimulus 
on the occasion of its next appearance. “The burned child shuns the 
fire not because pain did anything to his movements, but because, since 
that pain, the stimulus has changed; it is now flame plus fear, no longer 
flame plus curiosity” (112, p. 218f.). Similarly, Carr ascribed the fixa- 
tion and elimination of responses primarily to the sensory consequences 
of an act rather than to the strengthening of S-R connections. “These 
consequences do not influence the portion of the act that preceded 
them . . . they do affect the subsequent functioning of the act” (29, 
p. 96). Whether or not it is possible to cut the Gordian knot of retroaction
by insisting that effects exert their influence in a forward direction,
criticisms such as those of Hollingworth and Carr were important
because they shifted the emphasis from the response to the stimulus. 

In trying to evaluate the arguments for and against retroaction, one 
is struck by the fact that the battle has been fought largely on the plane 
of logic (312). There can be no doubt that rewards and punishments 
following responses modify these responses on the occasion of their 
recurrence. Not enough is known about the basic mechanisms of learn- 
ing to decide whether consequences of a response can or cannot act back 
on a connection or whether perforce they must exert their effect in a 
forward direction. Operationally speaking, the consequences of a re- 
sponse always act in a forward direction since they can only be tested 
on the occasion of a future occurrence of the stimulus situation. Physiologically
speaking, when all is said and done we had best admit our ignorance
of the action of cortical neurones. Perhaps part of the confusion
has stemmed from a premature reification and physiologizing of the 
terms involved. The words connection and effect are logical constructs;
the words stimulus and response are generic terms denoting classes of 
events with certain properties in common (237). It is only too tempting 
to forget the limitations of these concepts and to manipulate abstrac- 
tions as if they were clearly delineated entities with equally unequivocal 
physiological counterparts. For analytical purposes we need such concepts
as stimulus, response, and connection, but are we not over-optimistic
when we assume that the nervous system makes its division along the
same lines? If misplaced physiological concreteness is avoided one may 
escape logical difficulties such as the dilemma of retroaction. 

What Is the Nature of Satisfiers? 

Whereas some critics were most concerned with the mechanisms 
mediating effect, others focussed their attention on the nature of the 
satisfiers and annoyers to which reference is made in Thorndike’s law. 
Although Spencer and Bain, in whose tradition Thorndike continued, 
frankly invoked pleasure and pain as agents responsible for the fixation 
and elimination of responses, Thorndike’s law has been a law of effect, 
not affect (113). He carefully defined satisfiers and annoyers in terms 
independent of subjective experience and report. “By a satisfying state 
of affairs is meant one which the animal does nothing to avoid, often 
doing such things as to attain and preserve it. By a discomforting state 
of affairs is meant one which the animal avoids and abandons” (264, 
p. 245). Although admittedly free of hedonism, such a definition of 
satisfiers and annoyers has faced another serious difficulty: the danger 
of circularity. The critic may easily reword the definition to read: “The 
animal does what it does because it does it, and it does not do what it
does not do because it does not do it.” This reductio ad absurdum is 
probably not entirely fair but it points up the danger of the definition in 
the absence of an independent determination of the nature of satisfiers 
and annoyers. The satisfying or annoying nature of a state of affairs 
can usually be determined fully only in the course of a learning experi- 
ment and cannot then be invoked as a causal condition of learning with- 
out circularity (108). In their experimental work Thorndike and his 
associates have made no significant attempts to establish the satisfying 
or annoying nature of their rewards and punishments independently of 
the learning experiment. The tacit assumption has been made through- 
out that the words “Right” and “Wrong” and/or small money gains 
and losses are, indeed, satisfying and annoying states of affairs. As 
Tolman and his co-workers have argued on the basis of strong experi- 
mental evidence, such events (including shocks) can equally well be 
conceived as signals or “emphasizers,” providing the subjects with information
as to the correct response (294). We shall return shortly to the
problem of cognitive (informative) vs. satisfying and annoying after- 
effects of a connection. 

Thorndike’s definition of satisfiers and annoyers has been defended 
against the stigma of circularity by emphasizing its empirical adequacy 
(171). “The primary fact is that there is a state of affairs, whatever 
its psychological classification, which happens as a result of, or at least 
after, an act. The state of affairs may be symbolic ... or it may be 
primarily sensory ... or it may be complexly perceptual” (171, pp. 
576-577). This statement is fully in line with Thorndike’s own reformu- 
lation of the nature of a satisfying state of affairs. In his more recent 
writings Thorndike has conceptualized the action of satisfiers in terms 
of the O.K. reaction (273, 277), a fundamental confirming reaction 
brought to bear by the organism on connections between situations and 
acts. He describes the O.K. reaction as the “unknown reaction of neu- 
rones which is aroused by the satisfier and which strengthens connec- 
tions on which it impinges” (273). This reaction is independent of 
sensory pleasure. It is “far from logical.” It strengthens “connections 
which are wrong, irrelevant, and useless.” It is said to bear little rela- 
tion to the intensity of the satisfier. Neither specific motivation nor 
effects relevant to a specific motivation are any longer assumed.

Stripped of virtually all defining properties and qualifications, the 
law does indeed have a very wide range of applicability but only at the 
expense of vagueness. The sum and substance of the argument now is 
that something happens in the organism (nervous system) after an act 
is performed. The fact that something happens influences further action.
This something is, however, so little defined that it has almost no
predictive efficiency. The O.K. reaction has no measurable properties;
the conditions for its occurrence are so general as to embrace almost 
every conceivable situation. Hence the operation of the O.K. reaction 
can be inferred only ex post facto, after learning has taken place. But 
here we are impaled again on the horns of the dilemma of circularity. 

Perhaps because the reformulated empirical law of effect had a de- 
gree of generality which made it an unsatisfactory tool for the predic- 
tion of specific facts of learning, several auxiliary principles were 
gradually introduced in the connectionist framework: the principle of 
belongingness, according to which S-R correlations which “hang to- 
gether” are learned more easily than connections lacking this quality; 
the principle of impressiveness which states that vivid stimuli are favored 
in learning; the principle of polarity according to which stimulus-re- 
sponse sequences function most readily in the order in which they were 
practiced; the principle of identifiability stating that the most easily 
identifiable connection is most easily learned; and finally the principle 
of availability which has it that the more available a response the more 
easy it is to connect it to a stimulus (227, 269). It is easy to see that
most of these principles are closely akin to the concepts of Gestalt 
psychology. According to Brown and Feder (18), “these conditioning 
factors which for Thorndike are so many ad hoc hypotheses, are logically 
prior to his other laws.” These writers claim that Thorndike’s theory 
of learning could be successfully rewritten in terms of Gestalt psychology.
Whether or not this assertion is true, 5 the use of these auxiliary
principles serves to point up the inadequacy of the O.K. reaction as a 
universal principle of learning. 

It is very probable that the modifications in Thorndike's description 
of the process of reinforcement were largely influenced by the results 
of his own experimental labors over the years. Most of his experiments 
were concerned with rote verbal learning and the acquisition of simple 
skills (269), and it is the results of these experiments which were inter- 
preted in terms of the confirming reaction and such auxiliary concepts 
as belonging and identifiability. Other advocates of the law of effect, 
especially those whose experimental interests have been more in the 
fields of conditioning and discrimination learning, have tended to insist 

5 Certainly Thorndike would not agree. He professes his “inability to understand”
Gestalt psychology. He feels suspicious of the concepts of Gestalt psychology “because
for a time they were offered as a substitute for psycho-vitalism” (267, p. 125). He concludes
that a connectionist theory is “far simpler and more in accord with what the
neurons are and can do” (p. 131). 
on a version of the law in which the term effect retained much more of
its original meaning and had direct reference to the primary and second- 
ary needs of the organism. Let us consider, for example, a recent formal 
statement by Hull of his law of primary reinforcement which, in his own 
words, is distinctly related to the law of effect. The law reads as follows:

Whenever an effector activity occurs in temporal contiguity with the 
afferent impulse, or the perseverative trace of such an impulse, resulting from 
the impact of a stimulus energy upon a receptor, and this conjunction is closely 
associated in time with the diminution in the receptor discharge characteristic 
of a need, there will result an increment to the tendency for that stimulus on 
subsequent occasions to evoke that reaction (127, p. 80). 

Reduction of a need is a critical factor in the reinforcement process, 
and by need Hull means a condition in which any of the commodities 
necessary for the survival of the organism are lacking or deviate seriously
from the optimum (127, p. 17). Disturbances in the “optimal conditions
of air, water, food, temperature, intactness of bodily tissue,
and so forth” give rise to needs and activities of the organism for the
search of survival. It is the highly important adaptive role of the learning
process to reinforce those movements which will lead to need reduction
and thus help the animal to survive. Hull recognizes, of course,
that much learning takes place in the absence of immediate reduction
of biological needs. To account for such learning, the principle of secondary
reinforcement is invoked. “The power of reinforcement may be
transmitted to any stimulus situation by the consistent and repeated
association of such stimulus situation with the primary reinforcement
which is characteristic of need reduction” (127, p. 97). Thus all reinforcement
depends directly or indirectly on the reduction of “viscerogenic”
needs. Since reinforcement is a basic condition of learning, it follows 
that learning is primarily an adaptive process in the interest of species 
survival. Biological needs provide the foundation on which the complex 
structure of a lifetime of learning is built. 

Hull’s formulation of the process of reinforcement has been echoed 
by many other investigators in the field of learning. In a long series of 
experiments (some of which will be discussed in detail below) O. H.
Mowrer, for example, has attempted to produce experimental evidence 
and theoretical argument for a version of the law of effect which is 
closely parallel to that of Hull in its emphasis on tension reduction. A 
typical statement by Mowrer claims that 

. . . learning is dependent not upon the mere association, or temporal contiguity, 
of stimuli (or responses) but rather upon the occurrence of a state of affairs 
which has been variously designated as goal-attainment, problem-solution, 
pleasure, success, gratification, adjustment, reestablishment of equilibrium, 
motivation-reduction, consummation, reward (190). 

Elsewhere Mowrer makes it unequivocally clear what he means by 
satisfaction. He writes: 

Most writers agree that satisfaction ought to be equivalent to pleasure and 
dissatisfaction equivalent to pain. But, then, what do we mean by “pleasure"? 
One view is that we can equate pleasure to drive or tension-reduction and pain 
to drive or drive-increase. . . . Satisfaction, pleasure, and drive-reduction are 
strictly equivalent (185). 6 

Mowrer’s interpretation of effect or a satisfying state of affairs is 
frankly and emphatically hedonistic. 7 Ultimately all behavior modifi- 
cation is mediated by pleasure (tension-reduction). The statement has 
a very familiar ring. The history of learning theory has completed a 
cycle, and the views of Spencer, Bain, and Baldwin are with us again 
though transplanted into the conditioning laboratory and accompanied 
by great refinement and sophistication in experimental techniques and 
procedures. The historian of learning will be especially interested to note 
that while Thorndike, the original advocate of the law of effect, has 
frequently dissociated himself from a hedonistic interpretation, there 
has emerged a group of neo-hedonists whose interpretation of the law is 
much closer to its historic roots. They were willing to assume the re- 
sponsibility of hedonism in order to make the law more useful in the 
design and interpretation of experiments. 

To many psychologists hedonism today is no more acceptable than 
it was at the turn of the century. A vigorous attack against the preser- 
vation of a hedonistic doctrine in the law of effect has been leveled by 
Allport, who has been especially concerned with the implications of a 
law of effect for the growth and development of personality. A strict 
adherence to a hedonistic law of effect would imply that “personality 
grows through the production of tentative trial-and-error acts, some of 
which are selected by the Great God Pleasure for establishment, and 
some by the Great God Pain for banishment” (2, p. 154). Allport be- 
lieves that the main fallacy of hedonism is the confusion of the by- 
product of a complex process with the process itself. He would agree 
that pleasure (tension-reduction) at times is an indicator of the success- 
ful accomplishment of an act (4) but emphatically denies that pleasure 

6 Similar emphasis on tension-reduction as the basic process of reinforcement is found 
in the writings of Muenzinger (199) in connection with his work on the effect of electric 
shock on learning. 

7 There is an important way in which such a view differs from that of the classical 
hedonists. For the hedonists pleasure was a motive, whereas for the modern learning 
theorists it is tension, or discomfort, which is the motive while pleasure is the result of 
tension-reduction which leads to learning (192). 

is the main causative factor in the fixation of responses. He also takes 
issue with the law of effect because it perforce implies a segmental ap- 
proach to the learner, ascribing learning to the satisfaction of this drive 
or that drive with the main emphasis on the reinforcement of responses 
(3). For Allport it is the person who learns, is rewarded, and utilizes 
his success in his future adjustment. Allport thus pitted an ego-oriented 
psychology of learning against the neo-hedonism of the law of effect. To 
this controversy we shall return presently. 

In attempting to evaluate the controversy which has raged around 
the definition of satisfiers one is struck by the key importance of the 
hedonistic issue. Certainly hedonism is an immediate ancestor of the 
law, and now that the principle of effect has reached an uneasy ma- 
turity it is clear that it cannot deny its origin without sacrificing much 
of its vigor. When the law is stripped of hedonistic implications, when 
effect is not identified with tension-reduction or pleasure (as by Thorn- 
dike), the law of effect can do no more than claim that the state of 
affairs resulting from a response in some way influences future responses. 
Such a statement is a truism and hardly lends itself to the rigorous de- 
duction of hypotheses and experimental tests. If a neo-hedonistic 
position is frankly assumed (as, e.g., by Mowrer) the law becomes an 
important tool for research, provided “satisfaction” is independently 
defined and not merely inferred from the fact that learning has occurred. 
The hedonistic position can be defended for the range of events for which 
it is experimentally demonstrable. It is probably fair to say that up to 
the present this range is fairly narrow and largely in the domain of 
animal learning. 

The main source of difficulty has been the universality claimed for 
tension-reduction as a principle of reinforcement. Hull and Mowrer, 
on the one hand, and Allport, on the other, probably would agree that 
reduction of the food need is a critical factor in a rat’s learning the true 
path through a maze. With the aid of the principle of secondary (and 
higher order) reinforcement, tension-reduction can then be made the 
ultimate condition of all learning. Such a generalization Allport rejects. 
Crucial experimental tests to arbitrate between these two interpreta- 
tions of more complex forms of learning still seem to be lacking. For 
the time being, therefore, acceptance or rejection of the hedonistic 
postulate must remain a matter of philosophical preference. 

The Nature of Punishment 
Thorndike’s View of the Effect of Punishment 

The preceding discussion has dealt primarily with the nature and 
action of satisfaction (reward). As originally stated, the law of effect 
was a two-pronged statement, putting the stamping-out powers of 
punishment on a par with the stamping-in effects of reward. In the 
light of experimental results Thorndike no longer considers punishment 
as an effective agent for the elimination of wrong responses. Indeed, 
again as a result of experimental evidence, a punished connection is now 
held to be strengthened more by sheer occurrence than it is weakened 
by punishment. Punishment is not the dynamic opposite of reward; it 
is not even capable of overcoming the effect of sheer occurrence (269, 
279). This view is, of course, an implicit resurrection of the law of exer- 
cise since sheer occurrence (frequency) is now credited with a certain 
degree of effectiveness in fixating responses, an effectiveness, moreover, 
which surpasses that of punishment. 

An impressive number of experimental investigations have been 
conducted by Thorndike and his associates to demonstrate the ineffec- 
tiveness of punishment (162, 163, 165, 166, 223, 268, 269, 276, 299). In 
most of these experiments the punishment consisted of the announce- 
ment Wrong made by the experimenter, accompanied sometimes by a 
small fine or an electric shock. The effect of the punishment was assessed 
in terms of the number of repetitions of the punished response as com- 
pared to the number of repetitions that would be expected on the basis 
of chance, i.e., if the subject were guessing at random. The results of the 
experiments led Thorndike to conclude that punishment, instead of 
weakening or “stamping-out” the wrong response, may have a variety 
of effects depending on the specific nature of the annoyer and the propen- 
sities of the organism. The important point is that what an animal is 
led directly to do by an annoyer may or may not make the repetition of 
the punished response less likely (269). If punishment does lead to 
the elimination of a response, its action is indirect; it leads to variability 
of behavior, thus increasing the opportunities for the occurrence of 
the correct response which is then reinforced by the direct action of the 
satisfier (OK reaction). 

Thorndike’s revised view of punishment startled the psychological 
public. Not only did it contradict the belief in the practical value of 
punishment which had become almost an axiom in our social life, it 
was also clearly at variance with a considerable body of other experi- 
mental data. We shall now consider (1) the criticisms leveled against 
Thorndike’s work on the effect of punishment, and (2) other theoretical 
and experimental developments bearing on the role of punishment in 
learning. 

Criticisms of Thorndike’s View of the Effect of Punishment 

One serious criticism of Thorndike’s data was statistical in nature. 
The effect of punishment was measured in terms of the deviation of the 
obtained number of repetitions from the number of repetitions to be 
expected by chance. Thus, if there were five possible responses to an 
item, Thorndike would consider the chance expectation of each response 
to be .20, and the obtained percentages were expressed as deviations 
from .20. Thorndike was thus applying the principle of indifference, 
which considers alternative events equally likely in the absence of 
information to the contrary. The trouble is that there seems to be some 
information to the contrary, viz., that the “natural choice frequency 
is usually not a chance one” (123). Whether or not Thorndike’s correc- 
tion for chance is adequate is an empirical question. It is necessary to 
compare the extent of actual repetition without after-effect with the 
extent of repetition following announcement of Wrong. Experimenters 
who used an empirical rather than an a priori baseline for the evaluation 
of the effects of Wrong find that punishment does have a weakening 
influence which is commensurate with the strengthening influence of 
reward (254, 255, 283). Thorndike and his associates fail to find such an 
effect even when favoritism of response is taken into account (162, 166). 
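
The difference between the two baselines can be made concrete with a minimal
sketch. The repetition rates below are invented purely for illustration and
are not taken from any of the studies cited; the function names are likewise
hypothetical. The sketch only shows that the same observed rate of repetition
after Wrong can appear as a slight strengthening against the a priori chance
baseline and as a weakening against an empirical baseline.

```python
def a_priori_baseline(n_alternatives):
    """Chance expectation under the principle of indifference."""
    return 1.0 / n_alternatives


def repetition_effect(observed_rate, baseline_rate):
    """Deviation of the observed repetition rate from a chosen baseline.

    Positive values read as strengthening, negative values as weakening.
    """
    return observed_rate - baseline_rate


# Hypothetical figures, assumed for illustration only.
n_alternatives = 5                  # five possible responses per item
repeat_after_wrong = 0.22           # repetition rate following "Wrong"
repeat_without_aftereffect = 0.28   # repetition rate with no after-effect

# Measured from the a priori baseline of .20.
a_priori_effect = repetition_effect(
    repeat_after_wrong, a_priori_baseline(n_alternatives)
)

# Measured from the empirically observed rate of repetition.
empirical_effect = repetition_effect(
    repeat_after_wrong, repeat_without_aftereffect
)

print(f"effect vs. a priori baseline:  {a_priori_effect:+.2f}")
print(f"effect vs. empirical baseline: {empirical_effect:+.2f}")
```

With these assumed figures the sign of the apparent effect reverses, which is
why the choice of baseline must be settled empirically.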

Failing to find a significant weakening of connections following the 
announcement of Wrong, Thorndike denied the efficacy of punishment 
in general. 8 Yet even within the restricted framework of a typical Thorn- 
dike experiment, and with as highly specific a “punishment” as the 
announcement of Wrong, the results vary considerably with the param- 
eters of the experimental situation. The medium by which the punish- 
ment is conveyed is important; it may make a difference whether 
Wrong is announced by spoken word or by, say, a signal light. The effect 
of the medium, the sheer “something happening” after a response may 
be powerful enough to obscure the weakening influence of punishment. 
Such, at least, was the conclusion of Stephens who argued that whenever 
punishment strengthens a bond, such strengthening may be due to the 
physical medium, whereas reward and punishment have directly opposite 
effects when measured from the baseline of informationless after-effects 
(a flash of light, a nonsense syllable) (255). 9 Jones (139), on 

8 While the majority of experiments were done with human subjects, Thorndike illus- 
trated the superiority of reward also with animal subjects (268). 

9 Thorndike and his associates also found that informationless after-effects such as a 
click led to more frequent repetitions than the announcement of Wrong. They were 
unwilling to conclude, however, that punishment had weakened a response and argued 
that a “neutral” after-effect was ambiguous enough to lead to self-administered reward 
since the subjects were free to interpret the neutral signal as a reward (165, 166). Such 
an argument comes dangerously close to question-begging. It indicates an unwillingness 
even to consider the possibility that punishment may have a weakening effect on a 
connection. 

the other hand, found that the medium by which punishment is conveyed 
makes little difference and that the elimination of errors depends 
on the position of the punished response in the pattern to be learned. 

Whatever the effect of the medium by which the punishment is 
conveyed, the initial strength of the association will be an important 
determinant of the effect. A strong association can be considerably 
weakened, a weak association which is close to zero cannot be inter- 
fered with much further. Stephens was able to show experimentally 
that strong connections (responses which appeared early and were 
persistent) are markedly and dependably weakened by the announce- 
ment of Wrong whereas initially weak ones are not. Correspondingly, 
Right has little, if any, influence on initially strong associations but con- 
siderably aids in making weak bonds strong (254, 256). Tilton (284, 
285) also emphasizes the importance of taking the initial strength of a 
response into account. 

There is another feature of the experimental situation which Thorn- 
dike fails to take into account but which may be, at least in part, re- 
sponsible for his failure to find a direct effect of punishment. In most 
of the experiments, there was only one right response and several wrong 
alternatives. Thus it was the subject’s task to learn positively only 
one item, but to eliminate several items. By virtue of its uniqueness, 
the right response is “figural" (has, as it were, high visibility for the 
subject), whereas a wrong response is one in a long homogeneous series 
of wrong responses (213). These considerations receive experimental 
support from the work of Dand (45) who equated the number of right 
and wrong alternatives and presently found an announcement of Wrong 
to have a definite weakening effect. Lorge (162) reports that the potency 
of a punishment is less the higher the initial probability of obtaining 
a right response (though not reliably so). It is safe to say that the pat- 
tern and sequence of Rights and Wrongs are at least partial determinants 
of the fixation and elimination of responses. 

The complex ways in which the effects of Wrong vary with the param- 
eters of the experimental situation highlight the need for caution in 
generalizing about the effects of punishment. Even when the punish- 
ment consists of a simple announcement of Wrong, the experimental 
evidence is still inconclusive. An announcement of Wrong is only a very 
special type of punishment situation. Indeed, one may question the 
appropriateness of the term punishment or annoyer for the description 
of an effect which is primarily informative in nature. In most of Thorn- 
dike’s situations, the connections to be learned are purely arbitrary and 
it is easily obvious to the subject that being right or wrong cannot possibly 
reflect on his intelligence. He may be neither gratified by being 
right nor annoyed by being wrong. Wrong may merely serve as a signal 
to substitute one arbitrary guess for another. We must reiterate the 
need to define satisfaction and annoyance independently before in- 
voking them as determinants of the learning process. 

Other Studies of the Role of Punishment 

When we turn to punishments other than an announcement of 
Wrong or slight monetary losses, the uncertainty of Thorndike’s gen- 
eralization becomes even more apparent. An impressive array of ex- 
perimental evidence can be marshalled in support of the proposition 
that punishment is an effective condition of learning. We shall first 
review the empirical findings and shall then consider the main theoreti- 
cal issues concerning the nature of punishment and the mechanisms 
of its action. 

Society has devised many and varied punishments for those who 
offend against law and convention but in the psychological laboratory 
punishment has come to be almost synonymous with electric shock. 
Shock is easily administered and it is always sure to have a physiological 
impact though its psychological effects are highly variable (296). Thus 
most of the data on punishment come from studies on electric shock. 

A review of the literature on punishment raises the following problems: 

1. Does punishment have any significant effect on learning? 

2. Is the effect of punishment generalized to the entire learning situation or 
does it lead only to the elimination of specific punished responses? 

3. What is the mechanism by which punishment exerts its effects? Does 
punishment weaken stimulus-response bonds? Does it affect the general level 
at which the organism functions? Or is its effect indirect, leading to increased 
variability of behavior and ultimately to differential reinforcement by reward? 

4. Does the main effectiveness of punishment reside in the information 
about errors with which it provides the learner or does it have a more im- 
mediate influence on associative strength? 

Does punishment have any significant effect on learning? We have al- 
ready seen that Thorndike answers this question in the negative but 
that the validity of this conclusion from his data is doubtful. The 
answer to this question is, trite as it may sound, yes and no. It is rather 
unfortunate that Thorndike and his associates have tended to put the 
issue on an all-or-none basis. The question should not be, "does punish- 
ment have an effect on learning or does it not?" but rather, "under 
what conditions is it effective and under what conditions does it fail to 
show results?” 

Against the blanket assertion that punishment is not instrumental 
in the elimination of wrong responses it is possible to cite a long list of 
papers covering more than half a century of experimental work which 
report that punishment is an effective condition of learning. 

Without attempting to review these experiments in detail, we shall 
summarize the main findings. 

1. A large number of different organisms have been spurred on to better 
learning by punishment: 10 earthworms (325), cockroaches (261, 300), dancing 
mice (324), rats (e.g., 20, 54, 56, 57, 111, 116, 155, 193, 196, 199, 200, 201, 202, 
204, 206, 301, 307, 322), chicks (36), cats (53), and, last but not least, human 
beings (10, 12, 13, 19, 24, 25, 26, 27, 41, 55, 79, 80, 87, 137, 138, 172, 173, 217, 
218, 302). 

2. Punishment, usually by electric shock, has improved learning in a variety 
of problem situations: discrimination boxes, mazes, reaction-time procedures, 
serial learning, mirror tracing, multiple choice situations. 

3. Punishment has been found to affect different measures of learning. In 
many studies, in which responses are scored as either right or wrong, punish- 
ment has led to a decrease in the number of errors (10, 13, 24, 25, 26, 27, 36, 41, 
54, 55, 56, 57, 79, 80, 87, 111, 116, 137, 138, 172, 173, 193, 196, 199, 200, 201, 
202, 204, 206, 218, 301, 302, 307, 326). When speed of learning is measured in 
terms of the number of trials required to reach criterion, we find that punishment 
often results in a smaller number of trials (19, 24, 25, 26, 27, 36, 41, 54, 55, 
111, 137, 155, 171, 193, 196, 199, 201, 202, 204, 206). Some investigators report 
the fact that punishment reduces the time required for learning (24, 25, 41, 55, 
79, 80, 87, 206, 302) but time has proved to be a difficult measure to interpret 
since punishment typically leads to a cautious hesitant attitude on the part of 
the learner so that a decrease in the number of trials and errors may be accom- 
panied by an increase in time per trial (10, 24, 25, 26, 172, 173, 206, 218). 

4. The effectiveness of punishment may also be demonstrated by retention 
tests. There are experimental reports suggesting that (1) punished groups 
remember the learned task better (27, 41, 56, 79, 80, 81, 206, 301), (2) punished 
groups are less subject to retroactive inhibition (27). Different measures of 
retention show the effect of punishment to unequal degrees, however (41, 81, 
301). 

On the basis of the experimental evidence it is thus possible to assert 
at least that punishment works in some situations some of the time. 
But note the qualifier some. None of the papers cited would justify the 
conclusion that punishment works in all situations all of the time. On 
the contrary, the experimental evidence emphasizes the extent to which 
the effectiveness or non-effectiveness of punishment depends on the pa- 
rameters of the experimental situation. 

1. The intensity of the punishment is important. As Yerkes and Dodson 
(326) pointed out more than forty years ago, and many investigators since have 
confirmed, there is in a given learning situation an optimal intensity of punish- 
ment. The relationship between intensity of punishment and learning efficiency 
is not linear over the total range of intensities. Once the optimal intensity has 

10 Conditioning studies employing a punishing UcS are not included here. 

been exceeded, efficiency of learning may decrease (36, 53, 87, 302, 326, 328). 
There is, moreover, no one optimal intensity for any given organism. The 
optimal intensity will vary significantly with (1) the difficulty of the task and 
(2) the stage of learning at which punishment is applied. 

2. There has been rather general agreement that relatively severe punish- 
ment (intensive shock) is most effective in the learning of simple habits such as 
black-white discrimination and simple maze patterns, and that relatively mild 
punishment is optimal in the case of difficult tasks such as more complex types 
of discrimination and exacting maze patterns (12, 26, 36, 53, 326, 328). 

3. The effects of shock may become disruptive (12, 79, 80, 87, 294, 303), 
especially if applied early in the learning process (25, 206, 218). Shock may also 
lead to increased variability of responses as compared with milder forms of 
punishment (260). 

4. As to the stage of learning at which punishment is most profitably in- 
troduced, we have already referred to Stephens’ general conclusion that the 
greatest effect is achieved with strong connections (254, 256). This conclusion 
is borne out by the findings of Valentine (301) and by a systematic investiga- 
tion of Bunch (25) who got better results with a few shocks late in the learning 
than with many more administered earlier in the process. The results of his 
experiments led Bunch to believe that for a given task there is an optimal 
combination of position in learning and number of trials with shock. This 
generalization clearly reflects the dependence of the effect of punishment on the 
experimental parameters. 
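
The non-linear relation between intensity of punishment and learning
efficiency, and the shift of the optimum with task difficulty, may be
sketched as follows. The inverted-U curve, its Gaussian form, and every
number in the example are assumptions chosen only to illustrate the shape of
the relation; none of them is drawn from the experiments cited above.

```python
import math


def learning_efficiency(intensity, optimum, width=1.0):
    """Illustrative inverted-U curve that peaks at `optimum`.

    The Gaussian form is an assumption made for the sketch, not a fitted
    model of any data.
    """
    return math.exp(-((intensity - optimum) ** 2) / (2.0 * width ** 2))


# Hypothetical optima in arbitrary intensity units: relatively severe
# punishment for a simple habit, relatively mild for a difficult task.
OPTIMUM = {"simple": 4.0, "difficult": 2.0}


def best_intensity(task, candidates):
    """Return the candidate intensity with the highest assumed efficiency."""
    return max(candidates, key=lambda i: learning_efficiency(i, OPTIMUM[task]))


intensities = [1.0, 2.0, 3.0, 4.0, 5.0]
best_simple = best_intensity("simple", intensities)
best_difficult = best_intensity("difficult", intensities)

print("best intensity, simple task:   ", best_simple)
print("best intensity, difficult task:", best_difficult)
```

Under these assumptions an intensity that is optimal for the simple task lies
beyond the optimum for the difficult one, so that a single fixed intensity
cannot be expected to be equally effective in both.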

Any generalization about punishment based on a single experiment, 
or even a series of experiments, thus becomes extremely hazardous. For 
a given organism, there are combinations of conditions under which 
punishment is effective. If the experimenter works outside this range of 
combinations, punishment fails to yield significant results (Thorndike 
and co-workers). If he works with several combinations of conditions, 
he may easily find that punishment is effective in one case but not in the 
other and certainly that it varies in effectiveness (12, 26, 87). If he 
works with only one set of parameters, he may be tempted to over- 
generalize. Such seems to have been the case in Thorndike's experi- 
ments on the effect of Wrong since most of the connections subjected to 
punishment were probably rather weak (254), and the punishment 
hardly very intensive in nature. Such lack of consistency in results need 
not lead to a counsel of despair regarding the possibility of generaliza- 
tion about punishment, but merely to an insistence that any such gener- 
alization should be based on functional relations covering a wide range 
of parameters and not on isolated, restricted experiments. 

A special problem of control needs to be mentioned here. In an 
attempt to hold as many of the experimental conditions as possible 
“constant," the experimenter may decide on a certain type and intensity 
of punishment and then adhere to it throughout the course of the experi- 
ment. He would then overlook the fact that the effectiveness of a 
punishment changes as a function of the number of times that it has 
been administered. In the case of shock, there are clear-cut experimental 
data to substantiate this point. Investigating electric shock as a moti- 
vating stimulus in conditioning experiments, Kellogg (143) found that 
in order to maintain a flexion reflex of fixed extent it is necessary to ad- 
minister relatively large voltages at the beginning of the series and to 
reduce the stimulus intensity toward the end of the series. He also 
found that the intensity required varied with the distribution of prac- 
tice. In addition, Kellogg reported wide individual differences among 
his subjects (dogs) in sensitivity and emotional reaction to the same 
physical shock. He concluded: 

It seems reasonable to infer that in learning experiments where the physical 
properties of an electrical stimulus are constant, the effects produced in the 
same subject at different times must differ greatly. The attempt to maintain 
"constant motivation" by this method probably defeats its own purpose (143, 
p. 95). 

Considerable individual differences in reactions to shock have also been 
reported for human subjects (296). Too few experimenters have heeded 
Kellogg's admonition to consider the effectiveness of a shock at a given 
point in time rather than its physical intensity. 

Is the effect of punishment general or specific? In the framework of 
connectionism, it is always a specific bond which is strengthened or 
weakened. Thus Thorndike and his co-workers have always concerned 
themselves with the history of specific rewarded and punished responses, 
finding punishment ineffective. Other investigators have found, how- 
ever, that punishment may have a much more general effect: there may 
be a general improvement in performance, with unpunished responses 
benefiting along with the specifically punished ones (10, 12). Even if 
the punishment itself is made non-specific (“non-informative shock”), 
and not administered as a consequence of any particular response, there 
is still a facilitating effect (11, 12, 19, 25, 79, 80). The work of Muen- 
zinger is of especial importance here. In a series of carefully controlled 
experiments he was able to show that (1) shock administered to rats 
after the choice-point in a maze facilitates learning and is more effective 
than hunger-motivation alone (193, 196); (2) shock before the choice- 
point decreases the efficiency of learning 11 (204) as does shock at the 
moment of choice (71); and (3) if shock is administered throughout the 
maze, its effect is only slightly less than with punishment after the 
choice point (199). These findings apply to shock for right responses as 
well as shock for wrong responses. When both right and wrong re- 
sponses are shocked, there is no summation of effects though learning 
efficiency is greater than if either right or wrong choices alone are fol- 
lowed by shock (204). 

11 This retardation is probably due to the persistence of position habits which become 
fixated (70, 101). 

The question asked at the beginning of this section, whether the 
effect of punishment is general or specific, turns out to be an unwise one. 
Clearly, the effect may be general, specific, or both, depending on the 
way the experiment is arranged and the measurements are made. In a 
conditioning experiment, for example, in which punishment is employed 
for the purpose of establishing a discrimination, the effect is of necessity 
highly specific. A response to, say, one tonal frequency is reinforced, 
and the responses to other frequencies are weakened or eliminated. It 
would be obviously nonsensical to expect unpunished responses to be 
affected in the same way as punished ones. At the other extreme of the 
continuum of specificity-generality is a situation in which punishment 
is administered without reference to any particular response, say, after 
a block of trials. In that case, the only possible effect that punishment 
can have is general, inducing an attitude of caution, serving as an in- 
centive, etc. The general conclusion on the generality-specificity issue 
is that punishment has both general and specific effects, sometimes one, 
sometimes the other, sometimes both, depending on its role in the total 
experimental situation. 

What is the mechanism by which punishment exerts its influence? There 
has been considerable speculation about the events in the organism 
which mediate the effect of punishment. This is, of course, an example 
of the search for intervening variables. Punishment is the independent 
variable, the effects of punishment (positive and negative) are the de- 
pendent variables. In terms of what hypothetical events can the re- 
lationship between these two variables be conceptualized? The answers 
which investigators have suggested fall into three broad classes: (1) 
mechanisms of action which are held to be specific to punishment, (2) ex- 
planations which reduce whatever effect punishment may have to the 
ultimate action of rewards, and (3) explanations which ascribe the ef- 
fectiveness of punishment to the information or perceptual emphasis 
which it provides. 

1. Mechanisms specific to punishment. Throughout his writings on punish- 
ment Thorndike has referred to the weakening or stamping out of stimulus- 
response bonds. Such weakening effects as he has found were ascribed to 
processes detrimental to the conductivity of the neurones. For the last fifteen 
years, however, he has argued against significant effects of punishment and 
therefore has not further concerned himself with a mechanism to account for 
the weakening of specific S-R bonds by punishment, putting the entire burden 
of explaining learning on the action of rewards. 

An important suggestion as to the mechanism by which punishment affects 
the emission of specific responses comes from students of operant conditioning 
(68, 237). Their experiments deal with the effects of negative reinforcement 
(punishment) on the lever pressing behavior of rats. The general conclusion is 
that punishment results in a suppression rather than weakening of a response. 
Administration of punishment temporarily depresses the rate at which responses 
are emitted, but the suppressed behavior will be released when circumstances 
are more favorable. In terms of Skinner’s concept of reserve, punishment 
affects the relation between reserve and rate of emission of responses but has 
no effect upon the reserve itself. The effect of punishment is therefore emo- 
tional, leading to a temporary depression of function but not to a permanent 
weakening or elimination of responses. These findings not only serve to em- 
phasize the differential effects of reward and punishment on the emission of 
responses but also convey an important methodological warning: the strength 
of a response at the end of a period of punishment cannot be used as a reliable 
index of the true long-range strength of the response (68). Punishment is not a 
mechanical stamping-out process. Its effect must be considered in relation to 
the experimental situation. 

The emotional effect of punishment is also emphasized in Guthrie’s attempt 
to account for the elimination of wrong responses (90). For Guthrie, the funda- 
mental condition of all learning is conditioning by contiguity in time. It is 
movements, not acts, which are conditioned to stimulus cues. The more aroused 
the organism, i.e. the greater the emotional excitement, the more varied and 
intense the movements which occur. These varied and intense movements pro- 
duce a correspondingly high degree of proprioceptive stimulation. Propriocep- 
tive stimuli are important cues to which responses are conditioned. Hence 
punishment facilitates conditioning (of withdrawal responses, for example) not 
because punishment is annoying but because punishment arouses the organism 
and provides a multiplicity of cues for the establishment of conditioned re- 
sponses. Guthrie insists that the action of punishment cannot be in any way 
clarified by referring to the action of annoyers. Suppose we define annoyance as 
a state of affairs which the animal avoids or changes. “But this ability to avoid 
is just what is necessary to explain unless we assume that learning has already 
taken place. If annoyance means avoidance, we had to learn to be annoyed at 
annoyers (by emotional reinforcement). It is what the punishment makes an 
organism do that counts not what it makes him feel!” (90, p. 14). 

From what may be called stimulus and/or response explanations of punish- 
ment we turn now to explanations which put the main emphasis on the changes 
wrought by punishment in the subject’s approach to the learning situation and 
his general manner of performance. We are dealing here with qualitative de- 
scriptions of performance under punishment. Several investigators have 
stressed the attitude of caution which the subject assumes when he is punished 
for incorrect responses. The learner is circumspect, he steps warily (10, 24, 25, 
26, 172, 173, 218). Muenzinger has found that a punishment or obstacle makes 
the subject pause before he proceeds with his response (196, 200, 202, 203). 
During this pause he is exposed to relevant stimulus cues. In this regard shock 
and an enforced delay (200, 203) or an obstacle such as a gap (202) after the 
choice-point are equivalent. According to Honzik and Tolman (116) a state of 
heightened vigilance may make this increased exposure to stimulus cues espe- 
cially effective. Thus punishment is described as making the organism more 
sensitive and vigilant, and as enhancing the animal’s opportunity to learn about 
the situation and indulge in successful vicarious trial-and-error. Similar qualita- 
tive observations describe punishment as raising the general incentive level 
(57, 79, 80). 

2. Explanation of effective punishment in terms of reward. When Thorndike 
jettisoned punishment as a condition of learning, he left open the possibility 
that it may have an indirect effectiveness by speeding up the occurrence of acts 
which are rewarded. Thus punishment may be a condition for the eventual 
application of reward and in this way a determinant of learning, though one step 
removed from the locus of the critical event. Hull's theory of primary reinforce- 
ment which makes all learning contingent on need-reduction clearly implies 
such an approach to punishment. The reduction of the effect of punishment to 
the action of reward has been made explicit in the writings of Mowrer (181, 183, 
184, 185, 186, 188). According to the conception which Mowrer presents, all 
motives involve tension or discomfort. Thus there is discomfort or tension due 
to hunger just as much as there is discomfort or tension due to shock. In this 
sense, all learning, dependent as it is on motivation, involves punishment, the 
punishment inherent in the motivating tension. Given the motivation (tension), 
learning occurs when the tension is reduced. Tension may be reduced either 
through the occurrence of a consummatory response (e.g., reduction of hunger 
by feeding) or through escape from punishment (e.g., escape from shock). In 
Mowrer’s own words, “ ... it is therefore meaningless to say that one type of 
learning is ‘through reward’ and another type is ‘through punishment’; each is 
an essential aspect of a single dynamic process” (184, p. 424). Similarly 
Muenzinger (199) argues that it is “poor logic and bad science” to compare 
learning by reward and learning by punishment. Without a primary state of 
imbalance and its subsequent reduction there can be no learning. Punishment 
is merely a way of increasing this imbalance or tension in order to make the 
subsequent need reduction more effective as a reinforcing state of affairs. 

Such a conceptualization assigns to punishment a much more important 
systematic role in learning than does Thorndike’s. For Thorndike, the effects 
of punishment are variable and unpredictable. Punishment has no vital place 
in his picture of the learning process. In the view represented by Mowrer and 
Muenzinger, on the other hand, punishment is at the very fountainhead of 
learning. For punishment induces tension, and without tension the behavior 
which leads to learning by reinforcement would not occur. Thus the law of 
effect again becomes a law of reward and punishment. It has, however, undergone 
a crucial transformation. Formerly either reward or punishment was believed 
to be a determinant of learning, and punishment was, as it were, reward with 
a negative sign attached to it. The modern version is a law of reward after 
punishment, for punishment provides the stimulus which makes a reward a 
reward. Both reward and punishment are integral parts of the learning situa- 
tion, provided they occur in the proper sequence and are functionally connected 
with each other. This analysis of the role of punishment, which Mowrer has 
defended on the basis of his experimental work, has all the beauty and all the 
dangers of simplicity. Here is an integrated, coherent account of learning, 
granted the basic assumption that learning cannot occur without initial tension and 
subsequent tension-reduction. It is precisely the granting of this assumption 
which is one of the controversies with which modern learning theory abounds. 
Thus we have come in our discussion to the same parting of the ways to which 
our discussion of reward has led us. In which direction progress lies depends on 
whether or not tension and tension-reduction can stand up as necessary (though 
not sufficient) conditions of learning. 

3. Punishment or emphasis? In general, an individual is subjected to punish- 
ment only when he has committed a wrong. In learning experiments, the same 
procedure has usually been followed. The subject is punished when he commits an 
error. Sometimes it follows unequivocally from the nature of the situation what 
constitutes an error (e.g., blind alley in a maze), at other times the experimenter 
decides arbitrarily which responses are to be considered right and which ones 
wrong (e.g., in learning word-number combinations in a Thorndikian experi- 
ment). Thus errors and punishment have been associated in investigations of 
the law of effect until Tolman, Hall and Bretnall (294) decided to explore the 
effect of punishment for right as well as for wrong responses. If punishment 
does, indeed, weaken associative connections, then punishment for correct 
responses should slow up the learning process, whereas punishment for wrong 
responses should result in faster learning. These expectations were not con- 
firmed. Tolman, Hall, and Bretnall found that (1) adding a shock to an audi- 
tory signal for right responses did not slow up learning significantly and (2) 
adding a shock to an auditory signal for wrong responses did not speed up 
learning but on the contrary slowed it down. Reinforcement of right responses 
was always more effective than reinforcement of wrong responses. These 
findings were so strikingly at variance with what the traditional law of effect would 
have predicted that the article reporting these findings was entitled “A disproof 
of the law of effect.” 12 

Tolman, Hall, and Bretnall believed that the reinforcing stimuli — both 
auditory signal and shock — do not stamp in or stamp out responses but rather 
serve to emphasize (perceptually or cognitively) the reinforced responses. 
Moreover, an emphasis upon correct responses favors learning whereas emphasis 
on wrong responses does not. Any relatively violent emotional stimulus may 
in addition have disruptive effects which will counteract the favorable influence 
of emphasis. Other experimenters took up the idea of administering punish- 
ment for right as well as wrong responses. The work of Muenzinger has already 
been referred to. In the first of his series of experiments he found that punish- 
ment for right responses and punishment for wrong responses were equally 
effective (193). In a subsequent repetition of his experiment he found shock for 
wrong responses inferior (196). In one experimental variation, shock was ad- 
ministered for correct choice, incorrect choice, and at the time of feeding. All 
three conditions accelerate learning significantly (57). As in other studies of 
punishment the intensity of the shock relative to the task is a crucial param- 
eter (87). 

The experiment of Tolman, Hall, and Bretnall was of considerable 
systematic significance because it shifted attention from punishment as 
an automatic mechanism to punishment as a perceptual event. The 
learning task may be regarded as a problem which the learner tries to 
solve. Anything which helps the learner to distinguish correct from in- 
correct responses facilitates learning. A punishment may merely serve 
to designate a given response as either right or wrong at the pleasure of 
the experimenter. 

Information Versus Effect 

The efficacy of punishment for right as well as wrong responses is 
closely linked to a general problem which has been repeatedly raised in 

12 Goodenough (84) criticized Tolman’s findings on statistical grounds but later ex- 
periments have tended to confirm Tolman's results (119, 214, 234). 
discussions of the law of effect. Can the efficacy of rewards and punishments 
be explained in terms of the information which they convey to 
the subject about the correctness or incorrectness of responses? If the 
effect of reinforcement were always mediated by information, it would 
not be possible to speak of a direct effect of satisfaction (need-reduction) 
or annoyance on stimulus-response connections. Rather, the hypo- 
thetical sequence of events in the organism would then become: 
stimulus-response-reward or punishment-information-strengthening (or 
weakening) of the association. In such a description, the mechanism 
by which information leads to the strengthening of associative connec- 
tions would be left open. One could, like Tolman, think of the formation 
of sign-gestalt expectations (288, 291) or assert that information leads 
to tension-reduction and that learning through information is only a 
special case of the law of effect. The problem of effect versus informa- 
tion has not always been posed in all-or-none terms. Granting that both 
factors may be operative, investigators have been interested in gauging 
their relative importance. 

Information is a concept which does not easily lend itself to experimental 
analysis. There is always the vexing possibility that what the 
experimenter considers to be mere information may act as reward, and 
conversely that what the experimenter believes to be a reward (need 
reduction) may serve as a source of information. There may, moreover, 
always be self-administered information (229). Nevertheless, experi- 
mental situations have been designed which at least made it possible 
to speculate about the role of information in the learning process. These 
experiments have been concerned with (1) clearly non-informative after- 
effects, (2) the question whether a learner need necessarily be aware of 
what he is learning, (3) the relevance of after-effects to the connections 
which they strengthen. 

Non-informative after-effects. In our discussion of punishment, experiments 
were cited which show that punishment administered without 
reference to a specific response facilitates learning. Similarly, rewards 
given after a block of trials and not following a particular correct re- 
sponse have been found effective (60, 281). In general, however, effects 
which give specific information have been found superior to non-inform- 
ative ones (60, 224, 298). 

The study of “neutral” after-effects has a direct bearing on the problem 
of effect versus information. An after-effect is “neutral” if it neither 
rewards, punishes, nor informs. Nonsense words, meaningful words, 
flashes of light, and clicks have been used as after-effects which merely 
happen after an S-R connection, presumably without telling the subject 
anything about the correctness of his response. Several investigators 
have found that such after-effects strengthen responses to some extent 
(165, 166, 255). The very ambiguity of such after-effects has made it 
possible to interpret them in different ways. Thorndike and his co- 
workers believe that ambiguous after-effects may be interpreted as re- 
warding by the subject and hence function as rewards (165, 166). 
Stephens (255), on the other hand, suggested that the mere fact that 
something happens after a response, that the connection is “attended to,” 
facilitates learning. Something happening after a response may, however, 
be interpreted by the subject as conveying information. “Neutral” 
after-effects have repeatedly failed to influence learning when the factor 
of information was carefully controlled (37, 38, 298). 

It seems that the use of neutral after-effects is a highly unsatisfac- 
tory procedure. The fact that an experimenter considers an after-effect 
neutral does not, of course, guarantee that the subject interprets the 
event in the same way. It seems puzzling that some experimenters first 
call an effect neutral, then turn the tables on themselves and assert that 
it was not neutral after all but rewarding. If it is neutral after-effects 
that we want to study, we should make sure that the effects are indeed 
neutral and then consider them neutral throughout, On the other hand, 
if we are interested in the effect of rewards, the study of ambiguously 
neutral after-effects provides a very devious avenue of approach to the 
problem. 

Awareness of what is being learned. If learning by reward can take 
place while the learner is not aware of what he is learning, then certainly 
effect and information could not be considered coextensive. Thorndike 
repeatedly claimed that rewards are effective even if there is little or 
no opportunity for inner rehearsal of the right responses (266, 269, 275), 
though he has been contradicted on that point (37, 298). Thorndike 
and Rock (282) then offered what they consider conclusive proof that 
learning without awareness of what is being learned can, indeed, take 
place. 13 In their experiment, learning depended on the discovery of a 
principle by the subjects. Yet there was only gradual improvement 
under reward and no sudden increase in successful responses. It seemed 
that subjects achieved insight without knowing that they did. The 
interpretation of this experiment depends entirely on the assumption 
that gradual improvement indicates lack of awareness. It turned out, 
however, that even subjects who are explicitly taught the principle on 
which their learning depends still may show gradual improvement (131). 
The point here is that understanding a principle and using it are two 
different things. Slow improvement may reflect lack of insight but also 
may be due to the learner’s inability to translate his understanding into 
smooth, efficient action. This attempt to prove learning without awareness 
must be considered doubtful. On the other hand, positive evidence 

13 For other experiments demonstrating that subjects can learn without being aware 
of what they learn, see Thorndike (269, 276). 
has been offered for the importance of awareness (104) and knowledge 
of results (7, 82, 140, 225) for successful learning. Whether or not learn- 
ing without awareness of what is being learned can take place must re- 
main an open issue. 

The role of relevance. A satisfying after-effect, according to Thorn- 
dike, does not act logically but in a mechanical way. It is likened to a 
natural force applied to do work and need not, therefore, be relevant to 
the activity of the organism at the time at which the reward is given. 
Suppose a subject is engaged in activity directed toward goal A. In the 
course of his trial-and-error behavior he is given a reward which leads 
not to goal A but to an irrelevant goal B. Nevertheless the response 
which led to the reward is reinforced. Such findings have been reported 
for both human and animal subjects (164, 271, 274, 276), casting doubt 
on the theory that rewards function primarily as sources of information 
relevant to the learning task. How conclusive are such experiments on 
the role of relevance? Relevance and irrelevance are terms which refer 
to the experimenter's interpretation of the situation. The subject’s in- 
terpretation and the wants under which he operates are an altogether 
different matter. The only conclusion which can be drawn from such 
experiments is that responses which occur in temporal proximity to a 
reinforcing state of affairs are strengthened (Hull). Relevance is a 
normative term reflecting the experimenter's judgment. 

Even if we agree that relevance is a useful dimension for the descrip- 
tion of reward, the evidence on the role of relevance remains inconclu- 
sive. There are strong indications that, at least in some experimental 
situations, rewards irrelevant to the subject’s motivation fail to facili- 
tate learning. The work of Wallach and Henle (305, 306) is a case in 
point. In a typical Thorndike situation (learning word-number combi- 
nations) the subjects were informed that they were participating in an 
experiment on extra-sensory perception and that responses called right 
on a given trial might or might not prove correct on subsequent oc- 
casions. These instructions rendered the rewards irrelevant to the sub- 
ject's task — guessing numbers by ESP. As a result, rewarded responses 
were not repeated more frequently than punished ones, nor did these 
experimenters obtain a level of repetition of wrong responses at all comparable 
to that reported by Thorndike. Wallach and Henle conclude 
that “with the learning motive eliminated no automatic effect of reward 
seems demonstrable.” Their argument is further strengthened by the 
fact that a change in instructions, telling the subject that responses 
would no longer vary in a random fashion, caused a highly significant 
increase in the number of rewarded responses that were repeated. 

The results reported by Wallach and Henle strongly suggest that 
rewards may not act as illogically and blindly as Thorndike’s findings 
had indicated. The subject’s attitude toward the learning situation needs 
to be taken into account. In the absence of explicit instructions not to 
learn, subjects will by and large instruct themselves to learn and utilize 
whatever cues the situation provides. A reward, even an irrelevant re- 
ward, is such a cue because a rewarded response is different from the 
majority of responses which remain unrewarded. Thus there is a tend- 
ency to follow rewards and to consider them relevant if not dissuaded 
by instructions to the contrary. Wallach and Henle have clearly shown 
that a change in attitude can resist the “mechanical” impact of a reward. 
Investigators of human conditioning have long recognized the 
importance of subtle attitudinal factors (108). Attitudinal factors are 
probably of equal importance in the study of S-R connections by Thorn- 
dike’s technique. 

The issue of information versus effect remains unsettled. The diffi- 
culty is that it has not been possible to prove conclusively that rewards 
do not yield information even when experimenters hope and believe 
that they do not. An experimenter provides what he considers a non- 
informative reward but the subject may extract information from it, 
especially if he is a subject capable of symbolization. 

The Spread of Effect 

There is one phenomenon which more than any other has bolstered 
the view that rewards act mechanically and blindly. This phenomenon 
is the spread of effect. Again the pioneer investigation was carried out by 
Thorndike. In 1933 he published evidence that a reward strengthens not 
only the connection which it directly follows and to which it belongs 
but also the connections which precede and follow the rewarded response 
(270). The closer in the series an item is to the rewarded connection 
the more it benefits from this spread of effect. Thus there is a double 
(before and after) gradient of effect, with the items preceding the reward 
showing somewhat less strengthening than those following the reward. 
Degree of spread depends primarily on the number of serial steps sepa- 
rating an item from the reward and not so much on sheer temporal 
proximity. 

Thorndike considered this finding an independent proof of the law of 
effect (272). The discovery of the spread phenomenon gave Thorndike 
new confidence not only in backward action but also in the mechanical 
action of satisfiers. Again in his own words: 

The satisfier acts (upon a neighboring punished connection) unconsciously 
and directly, much as sunlight acts upon plants, or an electric current upon a 
neighboring current, or the earth upon the moon. From a satisfier issues a 
strengthening force which the connections absorb. Which of them will absorb 
it depends on the laws of nature not of logic and teleology (270, p. 48). 

As to the particular mechanism responsible for the spread, Thorndike 
entertained two hypotheses: (1) the scatter hypothesis: the strengthening 
effect, being not a logical but a biological force, will sometimes miss its 
mark, striking preceding and succeeding connections. The gradient thus 
represents the decreasing probability of chance errors in the action of 
the satisfier. (2) The spread hypothesis: the confirmatory reaction caused 
by the reward may be diffuse, spreading out its influence over a range 
of items. 

Thorndike’s findings have been repeatedly confirmed with both 
human and animal subjects (15, 72, 134, 135, 136, 197) and the phenomenon 
became known as the “Thorndike effect.” Agreement ceases, 
however, when it comes to an interpretation of the phenomenon. Discussion 
has centered around a number of related questions: (1) Is the 
gradient of effect really double-winged or is the before gradient spurious? 
(2) Is the gradient of effect a gradient of variability? (3) Is the gradient 
a result of perceptual emphasis on the rewarded response rather than 
of the mechanical action of satisfiers? (4) Is the gradient of effect really 
a gradient of response habits? 

1. Is the gradient of effect bidirectional? For Thorndike one of the 
most important characteristics of the spread of effect is its bidirection- 
ality: rewards exert their influence both in the forward and backward 
direction. The interpretation of the before (backward) gradient runs 
into a methodological difficulty, however. A connection which precedes 
a reward also follows a reward. In Thorndike’s experiments rewards 
often followed each other rather closely. In criticizing Thorndike’s data 
Tilton therefore raised the question whether the backward gradient 
might not be a spurious function of a large forward one (286). He also 
pointed out that serial position needs to be taken into account with this 
type of material. Putting these considerations to the test — taking serial 
position into account and attempting to eliminate the influence of the 
forward gradient from the backward gradient — Tilton still found a 
double gradient but from both success and failure. In the case of the 
failure gradients, the backward one is more pronounced, while the forward 
gradient is the more pronounced in the case of success. Other investigators 
also report the existence of a failure gradient (197, 332). 
The double-winged gradient is thus confirmed, it is true, but we find 
that the effect of Wrong spreads as well as the effect of Right. These 
results render doubtful Thorndike's theory of the scatter or spread of 
the confirming reaction. As an alternative explanation Tilton suggests 
that connections which the experimenter regards as discrete are not 
functionally separate. Items in a series may be sufficiently unified to 
be affected by success and failure as a unit (286). It is not entirely clear, 
however, why such “sequential unity” should manifest itself as a double 
gradient. Further doubt is cast on this hypothesis by Zirkle’s finding 
that the rewarded item and the adjacent punished ones need not be 
qualitatively similar in order for the Thorndike effect to appear (331). 
Even more strikingly, Zirkle has demonstrated that it is the response 
adjacent to the rewarded item which is strengthened, not the adjacent 
S-R connection. When the relative order of wrong series items about 
a right item is shifted from one presentation to the next, there is no 
Thorndike effect when repetitions of S-R connections are counted. How- 
ever, a clear-cut Thorndike effect appears when repetitions of responses 
are counted by step-position alone without regard to the shifting po- 
sitions of the stimulus items. It has also been found that reward 
strengthens the early wrong responses to near-by stimuli not merely the 
last response (174). 

2. Spread of effect as a gradient of variability. There have been other 
reformulations of the Thorndike effect. Muenzinger and Dove (197) 
disposed of the scatter hypothesis which ascribes the spread of effect 
to inaccurate placement of reward and consequent uncertainty of recall. 
A gradient is present, with a steeper slope than usual, even if the suc- 
cessful response is learned beforehand so that there can be no possible 
confusion between the correct response and the surrounding wrong ones. 
For a gradient of uncertainty Muenzinger and Dove substitute a gradi- 
ent of uniformity or variability. Success produces a gradient of uni- 
formity: this is in essence the original Thorndike effect. Failure pro- 
duces a gradient of variability: not only does the wrong response itself 
tend to be varied but right responses near the wrong one are not re- 
peated with the same degree of uniformity as those farther away in 
the series. Spread of variability under punishment is also reported by 
Stone (260). This analysis still leaves open the question as to what 
intervening mechanism the changes in variability should be ascribed. 

3. Is the effect gradient due to perceptual organization? The systematic 
differences in approach to the problem of effect in general are necessarily 
reflected in interpretations of the spread of effect. Again we find 
those who regard rewards and punishments primarily as perceptual 
events opposing those who think of the mechanical action of satisfiers 
and annoyers. The conditions under which the Thorndike effect gener- 
ally appears are especially favorable to an analysis in perceptual terms. 
The rewarded item is an isolated Right in a long homogeneous series of 
Wrongs. The announcement of Right may be regarded as a figure against 
a homogeneous ground of Wrongs. Wallach and Henle were the first to 
suggest that it is the extreme crowding of items which interferes with 
memory for wrong responses and contributes to the subject's tendency 
to repeat them (305, 306). The relation between perceptual isolation 
and the Thorndike effect was systematically investigated by Zirkle 
(332). He found a clear-cut positive relation between the degree of 
isolation of a rewarded response and the steepness of the effect gradient 
(especially the after gradient). He also found a failure gradient around 
a perceptually isolated wrong response. On the basis of his experimental 
results Zirkle proposes a theory of isolation to account for the Thorndike 
effect. “A satisfier isolates tendencies which are at hand when it hap- 
pens . . . responses neighboring upon a key response tend to become 
isolated themselves because of their association with the key response” 
(332, p. 312f.). A theory of isolation is, of course, a perceptual theory of 
the spread of effect. That which stands out in a homogeneous series be- 
comes a focus of retention (similar to the von Restorff phenomenon). 
This interpretation is closely akin to Tolman’s law of emphasis. Thorn- 
dike and Zirkle thus clearly represent the opposition between an inter- 
pretation of the effect gradient in terms of “hit-and-miss” mechanical 
reinforcement on the one hand and a view stressing the laws of per- 
ceptual organization on the other. 

4. Is spread of effect a gradient of response habits? In terms of an 
orthodox connectionist analysis the Thorndike effect indicates that the 
influence of reward spreads to neighboring S-R connections. It is the 
S-R connection that is the basic unit of analysis. It is possible, however, 
to account for at least part of the effect gradient by an analysis of the 
response sequence, disregarding as it were the stimulus side of the S-R 
connection. We have already referred to Zirkle’s finding that a Thorn- 
dike effect can be demonstrated when responses are counted by step- 
position even though the stimulus order is changed from presentation to 
presentation. The problem of response habits was systematically at- 
tacked by Jenkins and Sheffield (136). They found that (1) when the 
rewarded response itself was not repeated few adjacent errors were re- 
peated and the Thorndike effect failed to appear; (2) when the re- 
warded response itself was repeated it was accompanied by a high level 
of repetition of errors and a typical effect gradient. Thus repetition of 
the rewarded response appears to be a necessary condition for the ap- 
pearance of the Thorndike effect. The spread of effect may therefore 
be due not to the automatic action of reward but to the subject’s tend- 
ency to repeat the same sequence of responses from trial to trial, i.e. the 
subject’s “guessing habits.” Responses are not independent of each 
other: choice of a response depends on the preceding responses. A 
repeated rewarded response ensures that “the errors following reward 
will be frequently preceded by the same response, i.e., will have a com- 
mon antecedent stimulus.” It is interesting to recall in this connection 
that preliminary rehearsal of the rewarded response yields an especially 
steep effect gradient (197) and that, on the other hand, a gradient fails 
to appear when the subject is not motivated to repeat the rewarded 
response (305, 306). Guessing sequences have also been reported with 
other types of multiple-choice responses (72), and there seems little 
doubt that at least part of the Thorndike effect can be analyzed in these 
terms. The demonstration of guessing habits does not necessarily disprove 
the automatic spread of reward as conceived by Thorndike. It 
may well be that both some “stamping-in” mechanism and response 
habits interact in the production of the Thorndike effect. 

As in the case of other conditions of reinforcement, it is necessary to 
stress the dependence of the spread of effect on the parameters of the 
experimental situation. As we have seen, the Thorndike effect fails to 
appear when subjects lack the motivation to learn (305, 306) though 
distraction does not seem to affect it (136). In animal subjects, increase 
in drive results in a higher frequency of repeated responses in general, 
with a consequent flattening of the after-gradient (134). Similar re- 
sults are obtained with an increase in incentive (135). Under such con- 
ditions, only responses preceding the reward yield a statistically reliable 
gradient. Such delicate dependence of the gradient on particular experi- 
mental conditions counsels against hasty analogies between the action 
of satisfiers and the "action of the earth upon the moon.” 

Parametric Studies 

We have repeatedly stressed the functional dependence of the effects 
of reinforcement on the parameters of the experimental situation. We 
now turn to a consideration of studies that are primarily concerned with 
the influence of such parameters of reinforcement. No exhaustive survey 
will be attempted but the major types of functional relationships will 
be illustrated. 


Amount of Reinforcement 

Thorndike's original statement of the law of effect included the as- 
sertion that "the greater the satisfaction or discomfort the greater the 
strengthening or weakening of the bond.” Demonstration of the corre- 
lation between amount of reinforcement and degree of learning has 
proved much easier with animal subjects than with humans. The moti- 
vation of animal subjects is more easily controlled and hence quantita- 
tive variations in reward are more effective. Increasing the food ration 
of a hungry rat affects behavior more drastically than adding 0.4 cent 
to the announcement of Right in a Thorndike experiment. 

Turning to animal studies first, there are a number of experiments 
with different species and different learning tasks which support the 
generalization that increases in reward lead to improvements in learning 
and performance. Chickens (86) as well as rats (42) run down a runway 
at a faster speed when the amount of food reward is increased. Incre- 
ments in reward also cause chicks to learn a maze with fewer errors 
(320), rats to show greater resistance to the extinction of a conditioned 
response (77), and chimpanzees to tolerate longer delays between the 
presentation of the stimulus and the response (208). Although the 
functional relationship between amount of incentive and performance 
varies from situation to situation, it is clear that the relationship is not 
linear. Equal increments in incentive do not lead to equal increments in 
performance. Thus Crespi, who was especially concerned with the 
quantitative relation between amount of incentive and performance, re- 
ports a sigmoid relationship over a wide range of variation (42). Hull, 
on the other hand, believes that the relationship is best represented by 
a simple positive growth function (127). 

The mechanism by which increments in reward exert their influence 
is still under discussion. Hull has suggested that an increase in the 
amount of reinforcement raises the limit to which the curve of habit 
strength approaches as an asymptote, although the rate of approach 
may be constant for all amounts of reinforcement. Hence increases in 
the amount of the reinforcing agent result in greater increments of habit 
strength per reinforcement. Hull also points out that in a conditioning 
situation the reward provides an important component of the stimulus 
situation which is conditioned to the response being reinforced. In such 
a situation a large reward stimulus evokes stronger, more vigorous, and 
more persistent responses than a small reward stimulus. Thus an in- 
crement in reward not only increases the amount of consummatory ac- 
tivity (need reduction) but also provides a more distinctive cue to the 
animal being conditioned (127). In this connection it is important to 
distinguish between sheer physical amount of reward and the amount of 
activity involved in the consumption of the reward. Wolfe and Kaplon 
were able to show that amount of reward and amount of consummatory 
activity are experimentally separable conditions of learning. Learning 
improves as a function of sheer amount of reward: a whole kernel of corn 
is a more effective incentive for chickens learning a maze than one- 
fourth of a kernel. However, learning also improves if the amount of 
consummatory activity is increased for a constant reward: when the 
whole kernel of corn is divided into four separate quarters the maze is 
learned better than with the equivalent amount given in one piece. Of 
the two factors, amount of consummatory activity influences learning 
to a greater extent than sheer amount of reward (320). 
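
Hull's argument about the asymptote may be made concrete with a small numerical sketch. The exponential form, the rate constant, and the two limit values below are illustrative assumptions and not Hull's fitted constants. The sketch shows only that, with the rate of approach held constant, a higher limit yields a larger increment of habit strength from every reinforcement, while successive increments shrink as the limit is approached.

```python
import math


def habit_strength(n_reinforcements, limit, rate=0.1):
    """Habit strength after n reinforcements under a simple positive growth function.

    `limit` is the asymptote (assumed to depend on amount of reward);
    `rate` is an assumed rate constant, the same for all reward amounts.
    """
    return limit * (1.0 - math.exp(-rate * n_reinforcements))


def increment(n, limit, rate=0.1):
    """Gain in habit strength contributed by the n-th reinforcement (n >= 1)."""
    return habit_strength(n, limit, rate) - habit_strength(n - 1, limit, rate)


small_reward_limit = 50.0   # hypothetical asymptote for a small reward
large_reward_limit = 100.0  # hypothetical asymptote for a large reward

# Compare the increment from the 1st, 5th, and 20th reinforcement.
for n in (1, 5, 20):
    print(n,
          round(increment(n, small_reward_limit), 3),
          round(increment(n, large_reward_limit), 3))
```

Under these assumptions the increment from any given reinforcement is proportional to the limit, which is the sense in which a larger reward gives greater increments of habit strength per reinforcement.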

The role played by amount of reinforcement has also been explained 
in terms of the subject's attitude rather than in terms of differential 
strengthening of S-R connections. Crespi has proposed a two-factor 
theory of incentive-value. He conceives of incentive-value as propor- 
tional to the distance between the subject’s level of expectation (both of 
quantity and quality of reward) and the level of attainment. Attain- 
ment which does not reach the level of expectation is frustrating and af- 
fects learning adversely, while attainment above expectation is "elating” 
and improves learning. It is not the sheer amount of reward alone which 
is decisive nor the animal’s expectancy alone but rather the relation 
between the two. A striking illustration of this principle is the fact that 
rats will perform significantly better with no incentive at all than with 
a very small incentive. A very small incentive serves to whet the ani- 
mal’s appetite, raises his level of expectation and eventually leads to 
frustration. In accordance with the same principle, downward shifts in 
the amount of incentive lead to poorer and more variable performance 
whereas upward shifts improve the performance (42, 42a). Crespi’s con- 
ceptualization is reminiscent of the experiments on level of aspiration of 
human subjects. That the effectiveness of a given quantity of reward 
cannot be evaluated without reference to the subject’s expectancy is 
also stressed by Cowles and Nissen (40) on the basis of experiments 
on delayed responses in chimpanzees. Similarly Nissen and Elder 
(208) report that increases and decreases in amount of incentive affect 
not only the response within a given trial but succeeding trials as well. 
Such perseverative effects suggest the operation of reward expect- 
ancy. 
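
Crespi's principle can be stated as a simple signed difference. The sketch below is an illustration only: the function name, the proportionality constant, and the reward quantities are hypothetical, and the subject's expectation is simply assumed to equal the amount received on preceding trials.

```python
def incentive_value(attainment, expectation, k=1.0):
    """Signed incentive value; k is a hypothetical proportionality constant.

    Positive values correspond to attainment above expectation ("elation"),
    negative values to attainment below expectation (frustration).
    """
    return k * (attainment - expectation)


# Hypothetical reward amounts in arbitrary units (not Crespi's data).
downward_shift = incentive_value(attainment=4, expectation=16)
upward_shift = incentive_value(attainment=16, expectation=4)
unchanged = incentive_value(attainment=16, expectation=16)

print(downward_shift, upward_shift, unchanged)
```

On this reading the same physical reward of 16 units is elating after a history of 4 units and neutral after a history of 16 units, which is the point of relating attainment to expectation rather than considering either alone.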

When human subjects are used, comparisons of different amounts 
of reinforcement yield variable and inconclusive results. Slight increases 
in rate of learning as a function of increases in reward — addition of 
small money gains to the announcement of Right — have been found in 
some Thorndikian experiments (60, 223, 280). The effects are exceed- 
ingly slight if the amount of reward varies within the same series (223) 
and are somewhat more pronounced if incentives are changed from 
series to series (280). On the other hand, promise of a reward was found 
as effective as actual administration of the reward (78). In situations 
of this type, the experimenter never knows whether an increase in re- 
ward is experienced as such by the subject. Moreover, if different re- 
wards are presented in the course of an experiment, their effects may 
interact with each other. Whatever effects there are may possibly not 
stamp in individual responses more firmly but rather affect the general 
level of the subject’s motivation and thus affect learning secondarily 
(171). 

Frequency and Pattern of Reinforcement 

The simplest hypothesis relating frequency of reinforcement with 
strength of learning is that each reinforcement adds to the strength of 
the S-R connection being reinforced. This hypothesis is plausible, at 
least at first blush, in the light of empirical data. The percentage of 
conditioned responses usually increases as a function of the number of 
reinforcements (108); human rote learning steadily improves on succes- 
sive trials (171). Indeed, it is difficult to think of learning situations in 
which an increase in number of trials does not result in better perform- 
ance. 14 Frequency per se cannot be profitably considered a significant 
condition of learning. Rather, repeated pairings of stimulus and re- 
sponse allow effective conditions of learning (such as simultaneous con- 
ditioning, reinforcement by need reduction, confirmation of expecta- 
tions, etc.) to exert their effects. There would probably be little dis- 
agreement with such a general formulation of the role of frequency. 
Agreement ceases when the assertion is made that each individual rein- 
forcement in a series of reinforcements contributes a differential incre- 
ment (say, $\Delta\,{}_{S}H_{R}$ in Hull’s language) to the existing habit strength. The 
theoretical question at issue is whether or not the effects of successive 
reinforcements are additive and cumulative. Is the law of effect a law 
of cumulative effect? 

The assumption that the effects of successive reinforcements con- 
tinuously cumulate in time is central to Hull’s theoretical system (127). 
He conceives of habit strength as a monotonic increasing function of 
the number of reinforcements. As habit strength approaches the sub- 
ject’s physiological limit, the increment from each reinforcement pro- 
gressively decreases in magnitude. Thus habit strength is a simple 
positive growth function of the number of reinforcements. Habit 
strength, as Hull uses the term, is a theoretical construct which can be 
measured only indirectly through its behavioral manifestations. There 
are a number of behavioral studies whose results are consistent with 
Hull’s quantitative picture of the growth of habit strength: 

1. Reaction amplitude increases as a function of the number of reinforce- 
ments (118). 

2. Reaction latency decreases with the number of reinforcements (235). 

3. The number of trials required for experimental extinction may be pro- 
portional to the number of reinforcements (209, 317, 329). 

4. The more frequently a response has been reinforced the greater the 
probability that the appropriate stimulation will evoke that response (162, 
269, 276, 329). 

14 A notable exception is Skinner's finding that a single reinforcement results in a 
series of lever pressings by the rat at an optimal rate (237). One reinforcement is suf- 
ficient to establish an adequate response strength. Further reinforcements serve to build 
up a reflex reserve, which is subsequently emptied in the absence of reinforcement 
(extinction). 

Unfortunately not all experimental results available fit into Hull’s 
theoretical picture. Leeper (160) has criticized Hull for what he con- 
siders a biased selection of illustrative experiments. There is substantial 
experimental evidence that partial reinforcement, i.e., reinforcement on 
only a fraction of the trials rather than on each trial, may be at least as 
effective as continuous reinforcement. Thus Humphreys showed that 
reinforcement on 50 percent of the trials is as effective in the establish- 
ment of a conditioned response as reinforcement on 100 percent of the 
trials and that partial reinforcement may result in greater resistance to ex- 
tinction than continuous reinforcement (128, 129). Humphreys believes 
that conditioned responses occur to the extent that the subject expects 
the reinforcing stimulus to follow the conditioned stimulus; extinction 
occurs to the extent that the subject no longer expects reinforcement. 
Partial reinforcement during the initial training period makes it difficult 
for the subject to shift from expectation of reinforcement to expectation 
of non-reinforcement. In terms of the expectancy hypothesis it is neces- 
sary clearly to distinguish between frequency of trials and frequency of 
reinforcements as conditions of learning. The relation between these 
two frequencies determines the subject’s expectations and hence the 
course of acquisition and extinction of responses (130). The equal ef- 
fectiveness of partial and continuous reinforcement has, however, also 
been ascribed to the influence of secondary reinforcement during the 
ostensibly unreinforced trials (52). Such an explanation would be in 
conformance with a description of the reinforcement process as con- 
tinuous and cumulative. The relationship between number of trials and 
relative frequency of reinforcement is complicated by the phenomenon 
which Hovland has called "inhibition of reinforcement” (117). A mass- 
ing of reinforcements results in a weakening of conditioned responses, 
with spontaneous recovery within a short interval of time. Inhibition of 
reinforcement as well as expectancy have to be taken into account in 
the analysis of partial reinforcement (75, 76). 

The studies cited thus far by no means exhaust the evidence for the 
equal effectiveness of partial and continuous reinforcement. Brogden 
(16) showed that conditioned flexion to shock was established as readily 
with 40 percent reinforcement as with 100 percent. Decrease in the 
frequency of reinforcement had the positive effect of eliminating a great 
deal of the animal's diffuse and restless behavior. When food rather 
than shock was used as the reinforcing stimulus substantially the same 
results were obtained, except that there was a slight decrease in the 
frequency of response (CRs on f of the trials) when reinforcement was 
applied only 20 percent of the time. Brogden concludes that application 
of the reinforcing stimulus serves primarily as an incentive to the subject, 
and thus a low frequency of reinforcement may maintain the conditioned 
response at a high level. 

Mowrer and Jones (187), who confirmed Humphreys’ findings, pre- 
sent an argument reconciling these results with the law of effect. The 
effects of a reward need not necessarily be restricted to the particular 
response that occurs just before the reinforcement. The reinforcement 
applies to preceding responses as well (though to a decreasing extent). 
If we think in terms of response units (sequences) each of which is fol- 
lowed by a reward rather than in terms of individual responses which 
sometimes are reinforced and sometimes are not, the apparent advan- 
tage of intermittent reinforcement disappears. On the contrary, in 
terms of a response-unit analysis, the intermittently reinforced group 
which has to expend more effort in order to obtain a reward gives fewer 
extinction responses. 15 The response-unit hypothesis of Mowrer and 
Jones points to the importance of considering the temporal pattern of 
reinforcement and not only the sheer frequency. Skinner's results with 
periodic reconditioning are a relevant case in point (237). When the 
lever-pressing response of the rat is reinforced periodically, e.g., every 
three minutes, the animal’s rate of response not only tends to become 
uniform, but also the more frequent the reinforcement the more rapid is 
the rate. On the other hand, when reinforcement is at a fixed ratio, i.e., 
when the final member of a fixed number of responses is reinforced, the 
less frequent the reinforcements the higher is the rate of response. Such 
laws of operant behavior can be analyzed only in terms of the total 
temporal pattern of responses and reinforcements (e.g., Skinner’s con- 
cept of reflex reserve). The importance of the temporal pattern is also 
illustrated by Brunswik’s finding that in discrimination learning proba- 
bility of success is an important determinant of the animal's behavior at 
a choice point (21). 

In summary, it is clear that frequency of reinforcement is an impor- 
tant determinant of the strength of learning. However, partial rein- 
forcement can be as effective as, and at times more effective than, continuous 
reinforcement. Reinforcements are not always simple additive units, 
and the temporal pattern of a series of responses and reinforcements gives 
rise to behaviors which cannot be predicted in terms of a simple mono- 
tonic relationship between frequency of reinforcement and strength of 
learning. 

Delay of Reward and Gradient of Reinforcement 

The degree to which reinforcement strengthens a response depends 
in part on the time interval that elapses between response and rein- 
forcement. Experimental inquiry has been directed at two interde- 
pendent problems: (1) the difference between immediate reinforcement 
and delayed reinforcement and (2) the exact quantitative relationship 
between length of delay and strength of learning, i.e., the nature of the 
gradient of reinforcement. 

15 Mowrer and Jones (187) also point out that the law of effect is compatible with 
Humphreys’ expectancy hypothesis. The fulfillment of an expectation may serve as a 
reward (need reduction). 

Thorndike early suggested that satisfying states of affairs are the 
more effective the closer they are temporally to the S-R bond. “Other 
things being equal, the same degrees of satisfyingness will act more 
strongly on a bond made two seconds previously than on one made two 
minutes previously ...” (265, p. 172). A few of the early investigations 
of delayed reward failed to bear out this prediction (309, 313), but it is 
now clear that in these experiments secondary reward was not con- 
trolled (the animal subjects were delayed in the food chamber of a maze), 
so that there was no effective delay of reinforcement. When these 
first attempts are discounted, the experimental literature shows general 
agreement on the superiority of immediate over delayed reward. The 
detrimental effects of delayed reinforcement have been demonstrated 
with animal subjects in maze learning (34, 100, 318), problem box learn- 
ing (222), the formation of discrimination habits (35, 321), and in the 
establishment and maintenance of operant responses (210, 211, 237). 16 
Similar findings have been reported for delayed punishment (32, 141, 
192, 308, 311, 321, 323). 

Different investigators used different periods of delay but there were 
strong indications that the most serious detrimental effects were con- 
centrated in the first minute of delay. Formal quantifications of the 
functional relationship between length of delay and strength of learning 
are found in the writings of Hull (127) and Perin (210, 211). On the 
basis of theoretical calculations, which fit a considerable amount of 
empirical data, Hull concludes that (1) habit strength is a negative 
growth function of the time separating the response from the reinforce- 
ment and (2) the asymptote of this gradient is zero, i.e., with a suffi- 
ciently long delay reinforcement becomes ineffective. In the case of the 
rat, the gradient reaches zero at a delay of about 30 seconds (210). This 
rather short gradient of habit strength constitutes the gradient of rein- 
forcement. 17 It is important to bear in mind that the gradient of rein- 
forcement refers to the effects of different intervals between single re- 
sponses and that it does not apply to one of a series of responses leading 
to a common reward (108). In the case of a series of responses, e.g., in a 
maze, the gradient is more extended and more complex, and is generated 
by the “summation of an exceedingly complex series of overlapping 
gradients of reinforcements, in part consisting of, but largely derived 
from, the 'primary’ reinforcements occurring at the end of the temporal 
period covering the behavior sequence involved” (127, p. 143). In other 
words, more and more members of a stimulus series acquire (secondary) 
reinforcing power, each probably in accordance with the short gradient 
of reinforcement. The summation and interaction of these short gradi- 
ents result in the extended goal gradient. Hull believes that the goal 
gradient is an exponential or negative growth function (not, as he had 
originally believed, a logarithmic function). The greater the influence 
of secondary reinforcement, the less steep the slope of the goal gradient, 
i.e., the greater the temporal range over which a reinforcement can exert 
its influence. 

16 Skinner (237) found that delays are detrimental only after periodic reconditioning 
whereas delays up to 4 seconds did not affect the original conditioning of the lever- 
pressing response. 

17 The expression gradient of reinforcement was first used by Miller and Miles (178). 

Hull’s goal gradient hypothesis (120, 122) has proved to be a power- 
ful deductive tool by means of which a considerable number of empirical 
results could be predicted, sometimes with striking accuracy. Among 
the findings which the hypothesis predicts and which were empirically 
confirmed are the following: 

1. With number of reinforcements held constant, when the delay of rein- 
forcement is short, less time is required to execute a response than under condi- 
tions of long delay of reinforcement (6). 

2. Of a pair of alternative acts the one which is reinforced with a shorter 
delay is chosen (5, 226). 

3. Other things being equal, the greater the difference in the delays of re- 
inforcement yielded by two alternative reactions, the more quickly the animal 
will learn to choose the act yielding the shorter delay (5). 

4. When the absolute differences in delay are equal, differentiation between 
two short delays is achieved faster than differentiation between two long de- 
lays. For example, a 30-second delay is more readily differentiated from a 60- 
second delay than a 60-second delay is from a 90-second delay (5, 85). On the 
basis of the goal-gradient hypothesis Hull has also been able to predict the 
effects of delays of equal relative, but of different absolute, duration (327). 
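
Predictions 2 to 4 above follow directly once the gradient is assumed to be exponential. The following is a minimal sketch, in which the decay constant k, the unit starting strength, and the use of the difference in strength as an index of ease of differentiation are all assumptions made for illustration rather than values taken from Hull.

```python
import math


def reinforcement_strength(delay_s, k=0.05, h0=1.0):
    """Exponential (negative growth) gradient: strength falls toward zero with delay.

    k (per second) and h0 are assumed constants, not fitted values.
    """
    return h0 * math.exp(-k * delay_s)


def separation(short_delay_s, long_delay_s, k=0.05):
    """Difference in strength between two alternatives.

    A larger difference is taken here, by assumption, to mean that the
    two delays are differentiated more readily.
    """
    return (reinforcement_strength(short_delay_s, k)
            - reinforcement_strength(long_delay_s, k))


# Prediction 2: the act with the shorter delay has the greater strength.
print(reinforcement_strength(5) > reinforcement_strength(20))

# Prediction 3: a larger difference in delay gives a larger separation.
print(separation(10, 60) > separation(30, 60))

# Prediction 4: equal absolute differences, short pair versus long pair.
print(separation(30, 60) > separation(60, 90))
```

Because an exponential curve is steepest near zero delay, any two delays a fixed number of seconds apart differ more in strength when both are short, which is the content of the fourth prediction.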

The gradient of reinforcement in Hull’s treatment is, strictly speak- 
ing, a temporal gradient. As Hilgard and Marquis (108) point out, such 
temporal gradients should be clearly distinguished from non-temporal 
gradients which refer to the spatial distance between a response and the 
reinforcement or the serial position of a response with respect to the 
reinforcement. Spatial separation or remoteness in a series, of course, 
implies temporal delay since it takes time to traverse the space leading 
to the goal or the members of a series ending in reinforcement. On the 
other hand, an animal delayed in a restraining compartment is in a 
very different situation from a subject that has to cross a runway in 
order to reach a food reward. The main difference lies in the fact that 
the spatial interval is filled with a series of acts leading to the reinforce- 
ment whereas sheer temporal delay may serve to disrupt the integration 
of a behavior sequence. The rat’s speed-of-locomotion gradient (122) in 
a straight alley is a spatial gradient, with running speed plotted against 
successive segments of the path leading to the reward (6, 122) or punish- 
ment (23, 176). Similarly, the extensive experimental investigations of 
the goal gradient in maze learning (120, 238, and many others) and of 
animals’ ability to discriminate short from long paths to reward (35, 50, 
85, 327) should be considered primarily as studies of the non-temporal 
aspects of the gradient of reinforcement. In such experiments, the 
crucial experimental variable is the distance between a segment of the 
apparatus (runway, maze) and the locus of reinforcement. This dis- 
tance involves not only a temporal delay but also a complex sequence of 
intervening acts. 

A notable example of a spatial gradient is, of course, the Thorndike 
effect. The close kinship between the Thorndike effect and the goal- 
gradient was demonstrated by Muenzinger, Dove and Bernstone (198). 
In an "endless" maze (four identical mazes arranged as the sides of a 
square, with food boxes at the four corners) a double-winged gradient of 
elimination of errors was obtained. These authors believe that the goal- 
gradient is bidirectional in its fundamental form and that the usual 
backward elimination of errors in a maze reflects only the first half of 
the gradient. The other half cannot manifest itself because the animal’s 
activity usually ends at the goal. This analysis has not remained undis- 
puted. Hill (109) reports that the forward gradient appears only late in 
learning and ascribes it to the failure of anticipatory errors to be elimi- 
nated. In a more recent investigation, however, Thompson and Dove 
again report evidence for a basically bidirectional goal-gradient (262). 

With human subjects, gradients of reinforcement have usually been 
plotted as a function of serial position rather than in terms of sheer 
temporal delay. The various studies of the Thorndike effect are a case 
in point: the spread of effect is most clearly demonstrated when fre- 
quency of repetition is plotted against distance from reward in terms of 
response units (15, 270, 331). On the other hand, when time alone is 
considered, a 6-second delay is found to be as beneficial to learning as a 
0-second delay of reinforcement, and there is no evidence for a temporal 
gradient (167). The activity filling the interval between response and 
reinforcement is an important factor: when an announcement of Right 
follows an interval filled by another response, i.e., if the reward refers 
to the next to last response, the reinforcement is virtually ineffective. 
Response and reinforcement must “belong” together. (Such detri- 
mental effects of interpolated activities are, of course, well known in the 
study of retroactive inhibition.) The fact that the gradient of reinforce- 
ment, which is readily demonstrated with animals, cannot be easily ap- 
plied to human subjects is of considerable theoretical importance. With 
the aid of symbols the human learner can bridge temporal gaps that are 
prohibitive to the animal (192). Such apparent independence of im- 
mediate reward has considerably complicated the application of the law 
of effect to human learning. 

Strength of Drive 

The modern version of the law of effect equates effect to drive re- 
duction. The strength of the drive could be expected to affect the opera- 
tion of the law in two ways: (1) strength of the drive could be one of the 
determinants of the speed of acquisition and (2) performance (habit evo- 
cation) may vary with the strength of drive. 

What is the influence of the strength of drive at the time of learning? 
An experimental investigation by Finan (73) showed that with the num- 
ber of reinforcements constant an instrumental act is established more 
strongly when the drive is strong than when it is weak. A stronger drive 
yields a stronger habit although the relationship is by no means linear. 
This generalization has, however, not remained unchallenged. In Hull’s 
theoretical system, the course of acquisition (building up of habit 
strength) is described as independent of the strength of drive at the time 
of learning. Habit strength is a joint function of number of reinforce- 
ments, the time for which the conditioned stimulus has been acting be- 
fore the occurrence of the response to be learned, the time interval be- 
tween response and reinforcement, and the magnitude of the goal object 
(127, 144). Drive strength is not one of the variables of which habit 
strength is a function. In a recent study Kendler (145) has justified this 
omission of drive strength as one of the determinants of habit strength. 
In one of his experiments, animals learned a bar-pressing response under 
different degrees of thirst deprivation. The animals were, however, 
equated in strength of motivation during extinction. The results show 
that different drive conditions at the time of learning had not influenced 
the amount of habit growth. Kendler also showed that a group of sub- 
jects which had a low drive strength during learning but received a 
large number of reinforcements established a stronger habit than a 
matched group which learned under high motivation but received a 
smaller number of reinforcements. The results reported by Finan and 
Kendler are contradictory and the role of the strength of drive during 
acquisition must remain open. 

There is general agreement, on the other hand, that the strength of 
drive is an important determinant of performance. According to Hull, 
reaction-evocation potentiality is a multiplicative function of habit 
strength and drive strength. This hypothesis receives experimental sup- 
port from Perin's finding that for a given number of reinforcements, re- 
sistance to experimental extinction is an almost linear function of the 
number of hours of food deprivation at the time of the extinction pro- 
cedure (209). In his investigation of operant conditioning, Skinner (237) 
found that rate of response is considerably affected by the animal’s 
drive: over a wide temporal range of food deprivation, increase in drive 
leads to faster rate of emission of responses (emptying of the reflex re- 
serve). 
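
The multiplicative hypothesis can be illustrated as follows. The function name and the numerical values are hypothetical, and the units are arbitrary. The sketch shows only the defining property of a product, namely that a given change in drive produces a larger change in reaction potential when the habit is strong than when it is weak.

```python
def reaction_potential(habit_strength, drive_strength):
    """Reaction-evocation potential as the product of habit and drive (arbitrary units)."""
    return habit_strength * drive_strength


weak_habit, strong_habit = 0.2, 0.8  # hypothetical habit strengths
low_drive, high_drive = 1.0, 3.0     # hypothetical drive strengths

# Effect of the same increase in drive at two levels of habit strength.
gain_weak = (reaction_potential(weak_habit, high_drive)
             - reaction_potential(weak_habit, low_drive))
gain_strong = (reaction_potential(strong_habit, high_drive)
               - reaction_potential(strong_habit, low_drive))

print(gain_weak, gain_strong)
```

For a fixed habit strength the product is linear in drive, which agrees in form with the nearly linear relation between hours of deprivation and resistance to extinction, provided drive is assumed to grow roughly in proportion to hours of deprivation.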

Although performance varies with strength of drive, there are some 
indications that appropriate stimuli will evoke a response even when the 
subject is to all intents and purposes fully satiated (209, 330). Extra- 
polation from Perin’s theoretical curves relating resistance to extinction 
to drive strength predicts this result. But here, too, the last word has 
not been spoken. Koch and Daniel (147) report zero or close to zero 
reaction potential immediately after satiation. Although the general 
dependence of performance on strength of drive may be considered as 
well established, the precise quantitative nature of this relationship as 
well as the problem of interaction of different drives (144, 145) is still in 
need of further experimental analysis. 

The Place of the Law of Effect in Learning Theory 

Parametric studies such as those described in the preceding section 
derive their main significance from the contribution which they can 
make to general learning theory. We shall now consider the role played 
by the law of effect in theoretical interpretations of learning. 

Theories of learning can be classified in more than one way. One 
can distinguish between molar and molecular theories, according to 
emphasis on specific movements as against stress on acts and their 
outcomes; or one can pit modern associationism against configurational 
theories. The role assigned to reinforcement by reward and punishment 
provides another criterion of classification. There are clear-cut lines of 
division separating systematic points of view according to the role 
assigned to the law of effect: (1) effect may be considered to be a princi- 
pal condition of all learning, (2) effect may be rejected as a condition 
of learning but considered an important determinant of performance, 
(3) the essentials of the learning process may be conceptualized without 
reference to effect, with reward and punishment assigned a subsidiary 
role and credited with only indirect influence on learning. 

The Law of Effect as a Principal Condition of all Learning 

The law of effect could not become the pivot of a comprehensive 
theory as long as it was restricted to the narrow universe of multiple- 
choice (“trial-and-error”) learning. It was only with its application to 
the facts of conditioning that the law of effect could become a unifying 
principle around which a systematic theoretical structure could be built. 

For a considerable period of time students of effect and students of 
conditioning failed to make contact with each other. In the field of 
classical conditioning stimulus substitution was the primary principle of 
explanation. To the extent that substitution could not account for all 
the empirical data, auxiliary hypotheses such as the principle of domi- 
nance and such concepts as set and attitude were introduced (105, 106). 
On the other hand, Thorndike firmly maintained the distinction between 
learning by selection and fixation (formation of S-R bonds) and con- 
ditioning, which he termed “associative shifting.” Thorndike never 
believed that the study of conditioning could throw much light on the 
nature of learning: “The conditioned reflex is one type of learning that 
manages, even more completely than maze learning, to conceal the true 
nature of the learning process in a mass of special conditions” (267, p. 
85). In the light of this opinion it is not surprising to find that Thorndike
has almost completely ignored the facts and theories of conditioning
in all his writings. Students of conditioning, on the other hand,
could not indefinitely ignore the problem of effect, for the crucial role of 
incentive in the establishment and maintenance of conditioned re- 
sponses was an incontrovertible experimental datum. 

Paralleling Thorndike's dichotomy between connection formation 
and associative shifting, several two-fold classifications of learning situa- 
tions have been made: classical and instrumental conditioning (108); 
Type S and Type R conditioning (236); conditioning and success learning
(228); quantitative and qualitative conditioning (216); conditioning
and motivated learning (74). Such classifications reflect important dif- 
ferences in experimental procedures under which learning can take 
place, but, as Hilgard and Marquis (108) have emphasized, they do not 
represent pure types of learning which necessarily require principles of 
explanation as different as substitution and effect. In any given experi- 
ment, both types of learning may take place, and the classification of an 
experiment will largely depend on which aspects of the response the 
experimenter emphasizes and measures. A classical conditioning experiment
emphasizes stimulus substitution and homogeneous reinforcement;
an instrumental reward or escape experiment dramatizes the
principles of effect and heterogeneous reinforcement. “It is a common 
error to permit the reference experiment to dramatize a particular proc- 
ess, and then to suppose that the experiment represents a pure case of 
the process dramatized” (108, p. 97). 

Thus, the distinction between learning by selection and fixation 
(heterogeneous reinforcement) on the one hand and learning by stimulus 
substitution (homogeneous reinforcement) on the other does not pre- 
clude a unified conceptual scheme which makes a law of effect the basic 
principle of all learning. It is the virtue of Hull’s theoretical analysis 
that these two types of learning are subsumed under a common princi- 
ple: they are both reduced to the operation of the law of primary rein- 
forcement (Hull’s formulation of the law of effect). The basic principles 
and generalizations of Thorndike and Pavlov are combined and unified 
in a single theoretical structure (246). Hull’s law of primary reinforce- 
ment makes temporal proximity to a reinforcing state of affairs a con- 
dition without which learning cannot take place. Both selective learning 
(the Thorndike situation) and conditioned-response learning are special 
cases of the operation of this law. In the case of simple selective learning, 
one of many possible alternative reactions occurs in temporal proximity 
to need reduction and hence is differentially reinforced. Such a receptor- 
effector connection may or may not be of super-threshold strength at 
the beginning of an experiment. The conditioned response also depends 
on temporal proximity to reinforcement, but in this case a new receptor-effector
connection is almost invariably established: the connection between
CS and UR. Hull concludes that “the differences between the
two forms of learning are superficial in nature, i.e., that they do not involve
the action of fundamentally different principles but only the
differences in the conditions under which the principle operates” (127, 
p. 78). When the CR is considered basically akin to selective learning, 
the behavioral laws discovered in conditioning experiments can be ap- 
plied to, and integrated with, the phenomena of selective learning. The 
principles of conditioning in conjunction with the law of primary rein- 
forcement can be used in the deduction of more complex forms of 
learning. 

Although the law of primary reinforcement is closely akin to Thorn- 
dike’s law of effect, there is an important difference which Hull has 
made explicit. As the law has been repeatedly stated by Thorndike 
it is both a law of motivation and a law of learning. In a review of 
Thorndike’s work Hull (123) raised the question whether motivation 
(striving) produces the learning, or learning produces the motivation, 
or whether some third and still more basic process produces both.
Thorndike’s formulation seems to imply that striving is to be considered 
primary: he defines a satisfier as that which an animal strives to attain 
or does nothing to avoid. Hull, on the other hand, considers learning 
(“strengthening”) primary and derives striving from the principles of 
conditioning and the primary law of reinforcement (need reduction) as 
basic assumptions. In his own words, “states of affairs which organisms 
will strive to attain are reinforcing agents, not because they will evoke 
striving, but they evoke striving now because at some time in the past 
they were potent reinforcing agents, thereby joining stimuli and re- 
sponses . . . which constitute the striving” (123, p. 822). The point is 
that Hull is willing to assume as originally given only those reinforcing
agents which satisfy basic biological needs and are linked with the organism’s
survival, and he then derives other motives or strivings with
the aid of the principles of conditioning. In Thorndike’s writings, on 
the other hand, we find no such hierarchy of motives. The role of 
motives (wants, interests and attitudes in learning) is described as two- 
fold: (1) they determine what response a situation shall evoke, and (2) 
the satisfaction of wants strengthens S-R bonds (276). It is to the 
question of the origin of these wants, interests and attitudes that Hull 
addresses himself, seeking to derive them with the aid of conditioning 
principles. 

The gap between learning situations in which biological need reduc- 
tion occurs and the myriad of learning situations in which there is no 
such immediate primary reinforcement is bridged by the principle of 
secondary reinforcement. According to this principle, any receptor ac- 
tivity which regularly precedes a primary reinforcement will itself 
gradually become a reinforcing agent. The sight and smell of food as
well as other stimuli emanating from a feeding compartment, such as 
the click of the mechanism releasing pellets into a Skinner box (22, 237), 
are sources of secondary reinforcement. A stimulus such as a buzzer or
tone which has been regularly associated with shock also acquires rein- 
forcing properties (67, 69, 74). The reinforcing power of token rewards, 
i.e., rewards which may subsequently be exchanged for a primary reward
such as food, may similarly be explained in terms of derived reinforcement
(39, 65, 179, 207, 319). The effectiveness of social rewards
and punishment calls for similar explanatory concepts: a practically 
unlimited chain of higher-order conditioning must be assumed (127, 
177). 

Recently Spence (247) has suggested that it is the particular stimu- 
lus pattern at the time of reward which acquires secondary reinforcing 
properties. This reinforcing power is generalized to preceding stimulus 
patterns according to a temporal gradient. In this conceptualization the 
vexatious problem of backward action is eliminated. 

It is in Hull’s theoretical system that the law of effect has come to its 
full theoretical fruition: it has become a primary postulate with whose 
correctness a complex structure of deductions and theoretical interpre- 
tations must stand or fall. 

The Law of Effect as a Law of Performance 

It is possible to accept the law of effect in a descriptive sense, i.e., to 
recognize that rewarded responses are usually repeated in preference to 
non-rewarded ones, and yet to deny that the law of effect is a law of 
learning. Several writers, under the leadership of Tolman, have taken 
this position (64, 159, 287, 289, 292, 294, 315). A rigorous distinction is 
made between the acquisition and utilization of habits. The acquisition 
of habits depends on the formation of cognitive patterns within the 
organism which reflect the stimulus relationships in the environment. 
Such cognitive patterns Tolman has described as “sign-gestalt expectations”
or “hypotheses” (287, 288, 290, 291). Organisms come to accept
one event as a sign or “local representation” of another event (293). 
Learning occurs when the subject has built up an expectation that a 
given sign in the environment will, via a behavior route, lead to a certain 
significate or outcome. These expectations result from the organism’s 
commerce with the environment and their acquisition is governed by 
such conditions as frequency, recency, and perceptual laws of stimulus
organization (sign-gestalt formation). 18 Differential reward is not con- 
sidered a determinant of learning. Experienced reward does, however, 
play a role as a determinant of performance or utilization of habits. 
From a set of alternative responses to a stimulus (sign) that response is 
selected and performed whose consequence is most “demanded,” i.e., 
most rewarding in terms of the momentary motivational state of the 

18 In building up expectations the organism essentially reacts to the relative proba- 
bilities that signs in the environment will be followed by certain outcomes. A reinforce- 
ment theory stressing reaction to probability has been elaborated by Brunswik (21). 
animal. Even though knowledge and need (learning and performance)
are thus distinguished, behavior is always a joint function of both. 
When a subject moves toward a goal he must (1) have a need for that 
particular goal, and (2) have the knowledge that a given piece of be- 
havior on his part will lead to that goal. Hence needs and knowledge 
constitute an interdependent pattern or field (315). In this theoretical 
account, then, the law of effect is rejected as a law of learning and rele- 
gated to a secondary role as a condition of the moment-to-moment 
utilization of habits that have been acquired independently of effect. 
Only if motivation remains constant can behavior be accurately pre- 
dicted on the basis of the law of effect. 

Tolman calls his account of learning and performance a field 
theory 19 which may be applied to substitute stimulus (CR) learning, 
trial-and-error as well as more complex forms of learning such as
“inferential” and “inventive learning” (290). The learning theory put
forward by the chief exponent of field theory, Kurt Lewin (161), is
closely akin to that of Tolman. Lewin sharply distinguishes between 
learning as change in knowledge or cognitive structure (differentiation 
of unstructured areas, restructurization) and learning as change in 
motivation (changes in valences and values). Changes in cognitive structure
are ascribed in part to the same type of “forces” as govern perceptual
fields, in part to the impact of the needs, values, and hopes of the
subject. Among the complex of forces governing changes in motivation,
reward is only a minor factor, for “forces governing this type of learning
are related to the total area of factors which determine motivation and
personality development” (161, p. 239). Throughout his treatment
Lewin emphasizes the need to distinguish the motivational from the 
cognitive problems, to study their separate laws, and then to determine 
the role of each type of factor in different learning situations. In this 
connection it should be noted that the distinction between learning 
and performance is not limited to the field-theoretical approaches. In 
Hull’s account, a parallel distinction is made between the principles 
governing habit-formation and the principles governing habit use. 
Whereas the concept of habit strength describes the degree of acquisition, 
the construct of effective reaction potential refers to the degree to which 
a habit is ready for performance. 20 Thus both S-R theory and field 
theory allow for the distinction between learning and performance. The 
difference is that for Hull motivational factors govern both acquisition 
and performance, whereas in the Tolman-Lewin formulation motiva- 
tional factors come into play only in the utilization of habits. 
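
Hull's distinction between habit formation and habit use can be put in symbols. The following is only a sketch: the letters H, D, I, and E are shorthand chosen here, and the subtractive treatment of inhibition is an assumption, since the account states only that effective reaction potential is a product of effective habit strength and drive, taking account of the total amount of inhibitory potential (127).

```latex
% Assumed shorthand: \bar{H} = effective habit strength, D = drive,
% I = total inhibitory potential, \bar{E} = effective reaction potential.
\bar{E} = (\bar{H} \times D) - I
```

On this reading, a drive of zero removes the product whatever the habit strength, which agrees with the zero or near-zero reaction potential immediately after satiation reported by Koch and Daniel (147).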

One of the main experimental supports for the Tolman-Lewin view 

19 For another restatement of learning theory in field-theoretical terms, see Adams (1).

20 Effective reaction potential is a product of effective habit strength and drive, 
taking account of the total amount of inhibitory potential (127). 
has been the phenomenon of latent learning. When a reward is introduced
after a series of unrewarded trials in a maze, an improvement in
performance occurs which far exceeds the usual effects of a single re- 
ward. A substantial part of this sudden improvement may then be 
ascribed to learning (formation of sign-gestalt expectations) which took 
place during the unrewarded trials but was not utilized in the absence 
of reward. Such analysis is strictly in accord with Tolman’s theory. 
The phenomenon of latent learning was first demonstrated by Blodgett 
in 1929 (14) and has been repeatedly confirmed since (49, 51, 102, 103, 
295, 304). A recent repetition of Blodgett’s experiment by Reynolds 
(220), however, failed to show latent learning. This result has, at least 
temporarily, detracted somewhat from the support which the latent 
learning experiment has provided for the field-theoretical view. To 
the extent, however, that latent learning has been successfully demon- 
strated it provides a serious challenge to any view that would make 
learning a cumulative function of successive reinforcements by reward. 
As Leeper (160) has pointed out, it is impossible to argue that some other 
reward, such as satisfaction of the exploratory drive, accounts for the 
learning. Such a reward should reinforce exploratory behavior and not 
running to the goal with few errors once a food reward is introduced. In 
this connection it is interesting to note that in a recent attempt to de- 
rive the facts of latent learning within the framework of Hull’s the- 
oretical system Seward (232) was forced to abandon Hull's postulate 
(No. 4) that increments from successive reinforcement summate to 
yield a combined habit strength. Instead, he had to assume that con- 
ditioning is independent of reinforcement and is complete in one trial 
when S and R are simultaneous. On the other hand, Buxton (28) was 
able formally to derive the phenomenon of latent learning in terms of 
field-theoretical principles. 

Utilization of a habit depends on the presence of appropriate motiva- 
tion (appetite for the goal object). Performance of a given act does not 
always depend on the presence of any one particular motive: different
motives may be equivalent to each other in their capacity to evoke 
performance to the extent that the outcome remains congruent with the 
motive that prevails at the time of performance (61, 62, 63, 64). Within 
limits, therefore, changes in motivation may leave behavioral responses 
(habit utilization) relatively unaffected, or lead to only partial changes 
in performance. Even total removal of reward may lead to only a
temporary disturbance in performance rather than a permanent disin- 
tegration of a habit (310). When the subject is confronted with a choice, 
however, motivation may serve as the basis on which selection among 
alternative responses is made. On this hypothesis, if a subject were to 
learn that one route in a maze leads to food and another route leads to 
water, he would be expected to choose the food route when hungry and 
the water route when thirsty. This deduction has been repeatedly put 
to experimental test, but the results and interpretations have thus far
remained contradictory. 
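
The deduction just stated can be sketched in a few lines. This is a minimal illustration, not Tolman's own formalism: the function name `choose_response`, the route labels, and the numerical demand values are all hypothetical. Expectations (what leads to what) are stored without reference to reward, and motivation enters only when a response is selected.

```python
# Hypothetical sketch of the learning/performance distinction:
# knowledge is stored apart from need, and need acts only at choice time.

def choose_response(expectations, demands):
    """Pick the response whose expected outcome is most demanded right now.

    expectations: dict mapping response -> expected outcome (acquired knowledge)
    demands: dict mapping outcome -> strength of the momentary need (assumed scale)
    """
    return max(expectations, key=lambda r: demands.get(expectations[r], 0.0))


# Knowledge acquired in the maze: which route leads to which goal object.
expectations = {"left_route": "food", "right_route": "water"}

# Two momentary motivational states (illustrative values only).
hungry = {"food": 1.0, "water": 0.1}
thirsty = {"food": 0.1, "water": 1.0}

choice_when_hungry = choose_response(expectations, hungry)
choice_when_thirsty = choose_response(expectations, thirsty)
```

The same stored expectations yield different performances under different needs, which is the sense in which behavior is a joint function of knowledge and need.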

In 1933 Hull (121) performed an experiment in which he was able
to establish differential reactions to the same maze environment on the 
basis of different drives (hunger and thirst). He trained his rats in a 
simple two-route maze. The animals had to traverse one arm of the 
maze when they were hungry and were fed in the goal chamber. On 
days when they were thirsty, the animals had to traverse the other arm 
of the maze and received water in the same goal chamber. Hull’s animals 
learned this discrimination only with great difficulty: 25 training periods
of eight days each were needed before 80 percent correct responses 
were given on the first run of a series. Hull interpreted his results to 
show that animals can be conditioned to internal conditions (drive 
stimuli). The slowness and difficulty of the training would seem to 
run counter to the predictions of a sign-gestalt theory. The experiment 
was then repeated by Leeper (159), who fully realized its crucial significance
for a field theory of learning. He introduced an important modification.
In order to make a clear perceptual differentiation of the situation
possible for the subjects, he constructed a maze with two end-boxes,
one containing food and the other water. When a rat made an incorrect 
choice, it entered the goal chamber which contained the reward not 
desired under its prevailing motivation. Thus the rat had continual 
opportunity to build up an expectation of “what leads to what.”
Whereas in Hull’s experiment the rat was simply blocked on an incorrect 
trial, Leeper's subject acquired information about means-end relations 
on every trial, correct and incorrect. As a result Leeper’s rats required 
only one eight-day period to reach approximately the same criterion 
as Hull’s rats after 25 eight-day periods. 21 Leeper felt that his results 
fully support Tolman’s theory and the distinction between acquisition 
and utilization of habits. 

The results of Hull’s and Leeper’s experiments are reviewed in some
detail to show how virtually the same experiment lends itself to alterna- 
tive interpretations, one based on the law of effect and assuming sum- 
mation of reinforcement, the other making learning independent of 
effect. Subsequent studies of the same problem have remained equally 
inconclusive. Kendler (146) raised the question whether results similar
to Hull’s and Leeper’s could be obtained if both hunger and thirst 
drives were simultaneously present during the training period. He 
found that animals learned to respond appropriately on the test trials, 
i.e., in accordance with the motivation prevailing during the test trials. 
Although this result would seem to be exactly in accord with Leeper’s 

21 Hull explained the difference between his and Leeper’s results by suggesting that
after the first few trials Leeper's animals probably operated under both drives, thus 
being rewarded no matter which compartment they entered (127). In a reply to Hull, 
Leeper denied on the basis of his experimental records that such was the case (160). 
interpretation, Kendler offers alternative explanations within the framework
of an effect theory. He (1) speculates that only those drive stimuli
which are reduced during the training trial are associated with the rewarded
response, and (2) invokes anticipatory goal reactions as possible
differential cues.

Recently Spence and Lippitt (249) have reported “an experimental 
test of the sign-gestalt theory of trial and error” which is again concerned
with the utilization of alternative habits under different motivations.
The subjects (white rats) in this experiment were motivated by
thirst and given 12 days’ experience in a simple Y-maze. One arm of the
maze always led to water. The other arm led to food for one-half of the 
subjects, to an empty goal box for the other half. The test trials were 
run under hunger motivation, with the thirst drive satiated. During 
these test trials the subjects continued to go down the alley leading to 
water and the group which had previously experienced food was not 
superior to the other group in learning to choose the food alley. These 
results are clearly contrary to what would be expected in terms of 
sign-gestalt theory. However, the same authors had previously re- 
ported (in abstract form) an experiment which was more in accord with 
Tolman’s theory (248). In this case, the subjects were satiated for 
both food and water during the training series and found water at one 
end of the maze and food at the other. When made hungry or thirsty, 
the animals chose the alley leading to that goal which satisfied the need 
prevalent at the moment. At that time Spence and Lippitt concluded,
“Latent learning does not occur in the situation where animals perceive 
the subsequent goal object while motivated for another, but latent learning
does occur where complete satiation made for no particular goal-directedness.”
Thus the results of Spence and Lippitt do not necessarily call
for abandonment of a sign-gestalt interpretation but for a
modification: experience with goal objects which satisfy a need and with
goal objects for which there is no need during the training period are
not equally effective in learning (acquisition of expectations).

Just as the facts and interpretations of latent learning remain under 
discussion, the general problem of the mechanism of discrimination 
learning is still the cause of serious disagreement between proponents 
and opponents of an S-R-effect theory of learning. The experimental 
fact, on which there is general agreement, is that subjects can learn to 
choose between positive stimuli and negative stimuli simultaneously 
presented, i.e., between stimuli response to which leads to reward and 
stimuli response to which fails to lead to reward or leads to punishment 
(150, 154, 156, 157, 170, 205, 239, 241, 245). Even before the discrimina- 
tion has been learned, the animal does not respond in a haphazard 
fashion but shows definite systematic response tendencies, or “hypotheses”
(149, 150, 151, 152). It is around these systematic response
tendencies during the pre-solution period that the main argument between
“continuity” and “non-continuity” theories has been centered.
According to the continuity theory, discrimination learning, like other 
types of learning, results from a cumulative process of building up an 
association between the positive stimulus cue and the response. Every 
time a response to the positive stimulus cue occurs and is followed by 
reward, the association between cue and response is strengthened; 
every time a response to the negative stimulus cue is made and fails to 
be followed by reward, the tendency to respond to this cue is weakened. 
Discrimination is established when the difference in the excitatory 
strengths of the positive and negative cues is sufficiently great to over- 
come other aspects of the total stimulus situation which are not con- 
sistently associated with reward or failure. In terms of this analysis 
discrimination learning is fully explained in terms of stimulus-response 
association and the law of effect. The continuity hypothesis of dis- 
crimination learning has been sponsored by Spence (240, 241, 242, 243, 
244, 245), and McCulloch (168, 169, 170). Spence has shown that a 
derivation of "hypothesis” behavior during the pre-solution period is 
possible in the framework of the continuity theory (240) as is a derivation
of the special type of discrimination learning studied in “transposition”
experiments (242). According to the non-continuity theory
as it is interpreted today, the animal learning a discrimination changes 
from one systematic mode of response (hypothesis) to another until the 
problem is solved but practice on the unsuccessful hypothesis does not 
contribute to the learning of the correct association. This view has been 
defended primarily by Lashley (157, 158), Krechevsky (153, 154), and
Haire (97, 98, 99). 
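
The contrast between the two accounts can be made concrete with a small sketch. It rests on stated assumptions and is not taken from Spence or Lashley: the function names, the unit step per trial, the criterion value, and the simplification that each trial includes one response to each cue are all hypothetical. Under the continuity view, pre-solution trials on which the eventual positive cue went unrewarded leave a deficit that must be made up; under the non-continuity view they leave no trace.

```python
# Hypothetical sketch: trials needed to reach a discrimination criterion
# after pre-solution trials run against the eventual solution.

def continuity_trials_to_criterion(pre_trials_against, step=1.0, criterion=5.0):
    """Continuity view: every rewarded or unrewarded response changes cue strength."""
    positive, negative = 0.0, 0.0
    # Pre-solution period: the eventual positive cue goes unrewarded (weakened),
    # while the eventual negative cue is rewarded (strengthened).
    for _ in range(pre_trials_against):
        positive -= step
        negative += step
    trials = 0
    # Training proper: discrimination is established once the difference in
    # excitatory strength between the two cues reaches the criterion.
    while positive - negative < criterion:
        positive += step  # response to the positive cue is rewarded
        negative -= step  # response to the negative cue is not rewarded
        trials += 1
    return trials


def noncontinuity_trials_to_criterion(pre_trials_against, step=1.0, criterion=5.0):
    """Non-continuity view: practice on an unsuccessful hypothesis contributes nothing."""
    return continuity_trials_to_criterion(0, step, criterion)


baseline = continuity_trials_to_criterion(0)
after_adverse_practice = continuity_trials_to_criterion(4)
noncontinuity_result = noncontinuity_trials_to_criterion(4)
```

In this sketch the continuity account predicts slower learning after adverse pre-solution practice, while the non-continuity account predicts no change from the baseline.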

Both theories have been supported by important experimental 
investigations. A survey of the evidence suggests that the argument is
still in an inconclusive stage. The crucial experimental question is 
whether associations are formed during the pre-solution period which 
significantly influence the subsequent establishment of the discrimina- 
tion. In a typical investigation of this problem, the significance of the 
stimulus cues is reversed during the pre-solution period; the cue which 
is to be positive on the test trials is made negative and vice versa. 
According to the continuity theory such reversal should slow down 
learning on the test trials; according to the non-continuity theory the
reversal should have no effect. The first demonstration of the cumula- 
tive effect of training was presented by McCulloch and Pratt (170) 
in their study of weight discrimination by white rats. Preliminary 
training to the lighter of two weights slowed up learning when the 
heavier weight was made positive. In a study of visual form discrimina- 
tion habits of chimpanzees Spence (241) similarly found that the 
establishment of a discrimination was directly dependent on the excita- 
tory strengths of the positive and negative stimuli. The greater the 
relative number of reinforcements a given stimulus had received in a 
series of learning tasks, the easier it was to establish a positive response
to that stimulus. The correlations obtained were positive, high and 
significant. Spence also showed that sudden learning (insight), like
gradual learning, was positively correlated with the excitatory strengths 
of the positive and negative stimuli as determined by the number of 
reinforcements (243). 

Not all the evidence has been on the side of the continuity theory, 
however. Using the technique of reversing the positive and negative 
stimuli during the pre-solution period, Krechevsky (154) found that the
reversal did not significantly affect the learning of the correct solution. 
Lashley (157), after repeating Spence’s study of form discrimination
with rats, argued that evidence from positive correlations between
number of reinforcements and ease of discrimination learning is incon- 
clusive. He considers the assumption unwarranted that a high corre- 
lation between two arrays cannot exist when the values of one are in 
part determined by chance. In computing his own correlation Lashley 
omitted all trials showing systematic reaction to position, thereby 
changing the excitatory values of the stimuli by random measures 
from 0 to more than 100 percent, and yet the correlations between 
number of reinforcements and ease of discrimination learning were not 
reduced but slightly increased! Lashley then offered experimental 
proof against the assumption that all stimuli acting at the time of a 
response tend to be associated with that response. Rats who learned 
to respond in terms of size did not at the same time learn to respond in 
terms of that shape which was consistently associated with the positive 
stimulus. “If the animals are given a set to react to one aspect of a 
stimulus situation . . . large amounts of training do not establish association
with other aspects, so long as the original set remains effective
for reaching the food” (157, p. 259). Instead, Lashley emphasizes the 
role of perceptual organization and attention in determining which 
aspects of the stimulus situation will be associated with the response. 

The controversy still continues. In 1945 Spence (245) published a 
carefully designed experiment in which he again used the technique of 
cue reversal. Great care was taken to control all relevant factors in- 
cluding position habit. Cue reversal significantly retarded learning and 
again led Spence to conclude that the development of an association be- 
tween a cue and a response is a cumulative process which is independent 
of systematic response tendencies to other cues during the training 
period. Discrimination depends on the number of times the animal has 
been rewarded in the presence of the positive cue. 

In spite of Spence’s impressive results, the last word has not been 
spoken. Recently Lashley and Wade (158) published a stringent criti- 
cism of “neo-Pavlovianism.” They singled out for theoretical and 
experimental attack two assumptions central to “neo-Pavlovianism,” 
i.e., to modern effect theory: (1) that in conditioning all aspects of the
stimulus situation are associated with the reaction, and (2) that the
effects of reinforcement spread to stimuli other than those present during 
training (stimulus generalization). Lashley and Wade believe these 
principles to be contrary to fact. In their experiments, groups of sub- 
jects (rats, monkeys) were trained to react positively to a given stimulus. 
The stimulus would then be opposed in a discrimination experiment to 
another on the same stimulus dimension. In some cases, the reaction
to the initial stimulus was reinforced, i.e., the initial stimulus remained 
positive; in other cases, the reaction to the initially positive stimulus 
was extinguished, i.e., it was made negative in the differential training. 
The rates at which discriminations could be established under these 
two conditions were compared. In every case differential training was 
faster when the initial reaction was extinguished than when it was 
reinforced! The differences were consistent though not statistically 
reliable. On the basis of these results Lashley and Wade reassert that 
in discrimination relatively few, often not more than one, aspects of the 
stimulus are effective in the choice reaction of the animal. “A definite 
attribute of the stimulus is 'abstracted' and forms the basis of reaction; 
other attributes are not sensed at all or disregarded” (158, p. 81). 
Stimulus generalization is ascribed to failure of association. When a 
subject responds to a stimulus to which he was not originally trained, 
he does so because of his failure to attend to those characteristics which 
distinguish the training stimulus from the stimulus to which the reac- 
tion is generalized. On the other hand, a subject establishes a differ- 
entiation when he forms associations with such aspects of the stimuli 
as he had not attended to in earlier stages of his training. The burden 
of the argument is a denial of automatic irradiation and indiscriminate 
association of stimuli with responses by virtue of sheer temporal proxim- 
ity to a reinforcing state of affairs. 

The crucial conceptual distinction which the field-theoretical ap-
proach has introduced is between learning and performance. Though
recognizing the logical status of this distinction, McGeoch seriously
questioned its experimental usefulness. “The only way we can know 
that learning has occurred is by an observation of successive perform-
ances since learning is a relation between successive performances.
. . . Assertions that motive and effect influence performance but not
learning become meaningless in the absence of quantitative demonstra- 
tion, a demonstration which cannot be made without measurements of 
performance” (171, p. 599). Replying to this criticism, Leeper (160) 
pointed out that different test conditions may result in very different 
performances following the same learning situation. In strict conform- 
ance with McGeoch’s view one would then have to conclude that 
any learning situation results in a large number of “learnings.” A much
more parsimonious approach consists in the use of such distinct inter- 
vening variables as learning and performance, or, in Hull’s language, 
habit-formation and habit-evocation. 

Another difficulty which results from the distinction between learn-
ing and performance is the conceptual gap which is left between knowl-
edge and action. If learning consists in the building-up of expectations,
how is expectation translated into action, and how can the specific 
character of action be predicted? The field theory predicts what a sub- 
ject will come to expect but it fails to predict in any specific way what 
he will do as a result of the expectation. Guthrie, who has long been 
concerned with the analysis and prediction of particular movements, 
has expressed this criticism as follows: “Signs, in Tolman’s theory, 
occasion in the rat realization, or cognition, or judgment, or hypotheses,
or abstraction, but they do not occasion action” (90, p. 172). Hilgard and
Marquis (108) regard this failure to predict specific action as a weakness 
but also see in it a source of strength. A breadth of interpretation is 
possible which strict conditioning theories do not allow. A variety of 
performances can be grouped together in terms of the purpose which 
they serve without regard to particular details of movement. 

Rejection of the Law of Effect as a Basic Principle of Learning or Performance

It is his insistence on an analysis of the learning process in terms 
of specific stimuli and specific response movements which has led 
Guthrie (88, 89, 90, 91, 92, 93, 94) to the formulation of a theory in 
which the law of effect has no place as either a basic principle of learn- 
ing or of performance. Guthrie distinguishes strictly between acts 
and movements. An act is a class of movements defined by the end 
result. The law of effect with its emphasis on the consequences of S-R 
connections thus applies primarily to acts. In Guthrie’s view, however, 
it is to movements and not acts that the basic laws of learning must 
refer. The achievement of an act, i.e., of an effect or end result, is “com- 
pletely dependent on the acquisition, through learning, of a specific 
stereotyped movement or set of movements for the accomplishment of 
the effect that defines the act” (94, p. 51). As to the association of 
stimuli and response movements, it is entirely explained in terms of one 
basic principle — association by contiguity. It is always the last move- 
ment or set of movements made simultaneously with a given stimulus 
that is repeated on recurrence of the stimulus. Only a single coincidence 
of stimulus and response is necessary to establish the association. Thus 
learning is accomplished in one trial but so is unlearning, for one presen- 
tation of the stimulus on which the response is not made destroys the 
association. Although all learning is complete in one trial, the process 
of acquisition proceeds only gradually because of the continuously 
changing nature of the components of the stimulus situation. Only 
when the response movements have been conditioned to the various 
components of the stimulus situation — exteroceptive, proprioceptive, 
and interoceptive — can the response be reliably evoked. 
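
The way one-trial conditioning can still yield a gradual acquisition curve may be illustrated with a small simulation. The sketch below is not Guthrie's own formulation. It is a minimal illustration under assumed simplifications: the stimulus situation is treated as a pool of discrete elements, a random subset of elements is present on each trial, every element present is conditioned in that single trial, and unlearning is ignored. The function name and its parameters (`n_elements`, `sample_size`, `n_trials`, `seed`) are hypothetical.

```python
import random


def simulate_acquisition(n_elements=50, sample_size=10, n_trials=30, seed=0):
    """Simulate all-or-none conditioning of stimulus elements (illustrative only).

    Returns (curve, coverage): curve[t] is the fraction of the elements present
    on trial t that were already conditioned; coverage is the fraction of the
    whole pool conditioned after the last trial.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    conditioned = set()        # elements already attached to the response
    curve = []
    for _ in range(n_trials):
        # Assumption: the stimulus situation changes from trial to trial,
        # so only a random subset of elements is present each time.
        present = rng.sample(range(n_elements), sample_size)
        already = sum(1 for e in present if e in conditioned)
        curve.append(already / sample_size)
        # One-trial learning: every element present is conditioned at once.
        conditioned.update(present)
    return curve, len(conditioned) / n_elements


if __name__ == "__main__":
    curve, coverage = simulate_acquisition()
    print([round(p, 2) for p in curve])
    print(coverage)
```

Although each element is conditioned in a single trial, the proportion of already-conditioned elements met on a given trial rises only gradually, which is the sense in which acquisition appears gradual in this simplified picture.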

Guthrie’s description of the basic processes in learning is made inde-
pendently of the action of rewards and punishments. The practical
efficacy of reward and punishment can be satisfactorily accounted for 
in terms of stimulus-response contiguity. It is always the response last 
associated with a stimulus which will be repeated when the stimulus 
recurs. A successful response either removes the organism from the 
stimulus situation (for example, by allowing escape from a puzzle-box) 
or so alters it (for example, by removal of internal drive-stimuli through
feeding) that no new associations with the situation can be formed. 
Similarly the effectiveness of punishment depends on what it makes the 
organism do in the presence of a stimulus (for example, make with- 
drawal movements), not on what it makes him feel. It is the function 
of reinforcing agents to protect the associations made. Effect prevents 
unlearning and allows the law of recency to operate. 

Guthrie’s explanation of effect in terms of changed stimulus context 
has been criticized on the ground that rewards may be effective even 
though they do not clearly alter the stimulus situation (169). Eating 
a small pellet of food, for example, does not remove the drive stimulus 
and yet is effective in strengthening an S-R connection. Guthrie be-
lieves that even with a small reward the stimulus situation is materially
altered since the “annoyance” (the restless, excited behavior which
precedes feeding) is removed though the “annoyer” (the drive stimuli
maintaining the animal’s search for food) persists. Removal of annoy- 
ance removes the cues for successful action and thus prevents unlearning 
(92). This reformulation is not entirely conclusive. After eating, the 
animal may soon again be in an aroused state, the annoyance may re- 
turn, and previous responses would be expected to be unlearned. “It 
is obvious that something has changed when the animal ceases to run 
or to push levers and begins to eat, but a theory must state more pre- 
cisely just what the change is which guarantees the learning of the prior 
response if the theory is to be verified experimentally” (108, p. 92). 

As to experimental verification of the theory, the most outstanding 
work is that of Guthrie and Horton (95) who reported a detailed, well 
documented study of cats’ behavior in a puzzle box. The filmed records 
of the animals learning to escape from a puzzle box are well in accord 
with Guthrie's theoretical expectations. The most striking characteris- 
tic of the subjects’ behavior was its repetitiousness. The movements 
immediately preceding escape were distinguished by their stereotypy 
since removal from the situation presumably protected the last associa- 
tion formed. Whatever variability of behavior appeared is explained by 
changes in the stimulus situations which caused new associations to be 
established. 

More recently, an experimental study based on Guthrie’s theory 
was reported by Seward (230) who was interested in testing the “finality
theory of reinforcement.” He divided his subjects (rats) into two 
groups. One group was given food upon pressing a bar, the other group
was removed from the experimental situation when it had made the 
response. Though both groups learned to press the bar, the reward 
group was clearly superior to the removal group. Though removal from 
the situation protects the last association, the total effectiveness of 
reward cannot be ascribed to the termination of the situation. 

Stimulus-response contiguity theory has been criticized as not easily 
lending itself to experimental test (108, 246). The continually changing 
stimulus elements, especially the subjects’ own proprioceptive impulses, 
are not readily controlled and manipulated. Usually changes in the 
stimulus situation have to be inferred from the failure of a response to 
be repeated. Such inferences are, of course, circular. The main diffi- 
culty lies in the highly specific nature of the stimuli and responses in 
terms of which the analysis is made. The theory is distinguished by logi- 
cal and consistent formulation; it exemplifies a coherent account of 
learning developed without recourse to a law of effect. But it is still 
greatly in need of experimental verification. 

The Role of Effect in Complex Learning 

Our survey has shown the pivotal role of the principle of effect in 
the construction of a systematic theory of learning. Experimental tests 
of the various theoretical propositions have been conducted almost 
exclusively in the animal laboratory (Hull, Tolman, Guthrie) or with hu- 
man subjects in simple rote-learning situations (Thorndike). “Proofs” 
and “disproofs” of the law of effect in such experiments still leave 
open the question of what role effect (satisfaction) plays in more complex 
types of learning. The question is sharpened by two types of observa-
tion: on the one hand, much learning occurs in the absence of demon-
strable drive reduction, and on the other hand, satisfying responses
sometimes fail to be repeated (3). The challenge of complex adult learn-
ing, which so often seems to defy the principle of effect, can be met in
two ways. It is possible, with Mowrer (185), to defend the law of effect 
as a universal principle of learning despite the apparent failure of many 
learners to obey it — by reformulating the law of effect so as to encom- 
pass these more complex forms of learning. On the other hand, one 
may assert, as Allport has done (3, 4), that the law of effect holds only 
for animals, small children, mental defectives and in some peripheral 
phases of adult learning, whereas different principles, such as ego- 
involvement and active participation, need to be invoked in the analysis 
of more complex adult learning. Thus, for Allport the law of effect is 
not a law of learning but merely one of the many conditions which may 
favor learning — a condition, moreover, which applies only to a very 
circumscribed segment of learning behavior. The controversy between 
these two views is still in progress and is well exemplified by the recent
symposium on “The Ego and the Law of Effect.” 22

In attempting to extend the law of effect to learning which is not 
motivated by physiological need reduction the concept of derived or 
secondary drive plays a central role. A secondary drive is a learned drive.
“These secondary drives are acquired on the basis of primary drives,
represent elaborations of them, and serve as a facade behind which the 
functions of the underlying innate drives are hidden” (177, p. 19). 
Under the heading of derived drives come anxiety and anger and such 
social needs as pride, ambition, and desire for social approval. Miller 
and Dollard (177) suggest that self-induced stimulation is the basis 
of acquired drives as well as the basis of acquired rewards and purposes.
“Most of the responses which are the basis of socially significant ac-
quired drives and acquired rewards are internal responses ...” (p. 55).
In the course of learning, any stimulus cue which acquires the ability 
to reduce a secondary drive acquires reward value. 

Secondary drives are learned but they may also serve as the basis 
for further learning since the reduction of a secondary drive is a reward 
which may reinforce S-R connections in accordance with the law of 
effect just as does the reduction of a physiological drive. By postulation
of such derived drives the gap between animal learning and social learn-
ing may be bridged and analysis of the latter in terms of stimulus, re-
sponse, and effect (tension reduction) is made possible. Miller and Dol-
lard’s book, Social Learning and Imitation (177), represents an attempt
to apply this analysis to such social behavior as imitation, leadership 
and crowd action. May (175) has attacked the problem of war and peace 
in a similar conceptual framework, and Whiting (316) has applied 
stimulus-response-effect analysis to the process of socialization in a 
primitive society. In terms of such analyses, a complex network of 
acquired drives develops out of the primary biological drives, and it is 
through the reduction of these acquired drives that the process of sociali- 
zation operates. When adult learning seems to proceed without obvious 
drive reduction we must search for the derived drives whose satisfaction 
makes learning possible. 

The concept of derived drive has been experimentally fruitful. A 
series of experiments conducted over the last ten years or so by Mowrer 
is a notable demonstration of the extent to which this concept has pro- 
duced testable hypotheses. Mowrer has made a determined effort to 
explore the implications of the law of effect to their limits and to erect 
upon the foundations of this law a unitary account of learning which 

22 Psychol. Rev., 1946, 53, No. 6 (4, 185, 221). 

would have equal applicability to animal learning and to complex
social learning. The concept of derived drive appears to be the touch-
stone of this theoretical edifice. Mowrer has demonstrated, for example,
that preparatory set or expectancy may function as a drive whose reduc- 
tion serves as a motivating factor in learning (180, 183, 190). A state 
of tension or discomfort arises not only from the presence of a basic 
organic need but also from the anticipation of the recurrence of one or 
more of these needs. Not only does the application of an electric shock 
result in tension but also the anticipation of the shock, and this antici-
pation constitutes a derived drive. This state of tension is high before
the occurrence of the punishment and low immediately thereafter (183). 
Reduction of this anticipatory tension is rewarding. “Other things
being equal, the greater the extent of the drop in the expectancy-tension
after the occurrence of a stimulus-response sequence the greater the
reinforcing or learning-inducing value of this drop” (183, p. 38).

Mowrer’s stimulus-response analysis of anxiety (181, 182, 190) and 
fear (189) is closely related to his conception of expectancy. Anxiety is 
the (learned) anticipation of a noxious stimulus. This anticipation, 
which is a source of tension and discomfort and hence acquires drive 
quality, results in a variety of acts from which are selected and fixated 
(by the law of effect) those forms of behavior which are most instru- 
mental in the reduction of anxiety. The postulation of anxiety and 
anxiety-reduction explains the seemingly paradoxical finding that a 
conditioning procedure which permits avoidance of a shock results in 
better conditioning than a procedure which merely allows escape from 
the shock or mitigation of the shock (17, 190). Avoidance results in 
anxiety-reduction which provides reinforcement even though the noxi- 
ous stimulus itself is not delivered. The effectiveness of avoidance and 
concomitant anxiety reduction depends in part on the temporal se- 
quence of the noxious stimuli. If shocks are presented at regular tem- 
poral intervals, better conditioning results than with irregularly spaced 
presentation. When stimuli come in a regular order, anxiety mounts to 
a maximum as the time approaches for presentation of the stimulus, then 
drops. With irregular presentation, the subjects are “kept in a more or
less chronic state of apprehension or suspense” (182, p. 510). Mowrer 
and Lamoreaux have also been able to show that in avoidance con- 
ditioning, the CS becomes a source of anxiety whose termination is 
rewarding in itself independently of the reinforcement provided by 
avoidance of the shock (189). As a result, conditioning is better if the 
CS (source of anxiety) terminates at the moment when the conditioned 
response occurs than if it ceases either before or after the response. The 
fact that the CS in avoidance training becomes itself a source of anxiety
and thereby a motivating factor in its own right helps to establish con- 
ditioned avoidance responses which are radically different from the 
responses made to the unconditioned stimulus. Thus some of Mowrer 
and Lamoreaux’s rats were trained to jump in order to terminate the 
CS whereas the “correct” response to the UcS was running, and vice 
versa. Mowrer and Lamoreaux conclude that there are two distinct 
sources of reinforcement in avoidance conditioning, the one tending to 
strengthen the connection between fear and whatever response reduces 
the fear and the other which strengthens the connection between fear 
and whatever response eliminates the situation (shock and fear com- 
bined). They believe that the recognition of fear as a secondary motive 
in learning has important systematic implications for the law of effect. 
“In this way so-called anticipatory or ‘foresightful’ behavior can be 
made to conform very acceptably to the requirements of the Law of 
Effect, and seemingly 'purposive' or 'teleological,' responses are ac-
counted for well within the framework of scientific causation” (189,
p. 48).

Even though such intervening variables (secondary drives) as an- 
ticipation, anxiety, and fear 23 help to subsume under the law of effect 
behavior which is seemingly foresightful or purposive, a vexatious prob- 
lem remains. Why do organisms so frequently persist in behavior which 
is clearly punishing, or, at least, more punishing than rewarding? It 
is, of course, possible to argue that whatever behavior subjects persist 
in must somehow be rewarding or it would be abandoned. Such reason- 
ing would, however, be clearly circular and would beg the very question 
at issue in discussions of the law of effect. Recently Mowrer and Ullman 
(192) have attacked this problem of “non-integrative” learning. They 
believe that the key to the riddle of non-integrative learning lies in the 
temporal pattern of rewards and punishments. Many acts have con- 
sequences which are both rewarding and punishing. According to the 
gradient of reinforcement, those consequences which immediately follow 
the act will be more influential in learning than temporally remote 
effects. Integrative learning occurs when remote consequences are 
through symbolic behavior brought into the psychological present. 
Non-integrative learning takes place when the organism fails to make 
the symbolic bridge to the future and remains at the mercy of the im- 

23 Another derived drive is fatigue. In extinction, fatigue is the motivation competing 
with the performance of the act. Not to respond reduces the fatigue and is therefore 
rewarding. This analysis is supported experimentally by the fact that effortfulness of 
task is inversely related to the number of extinction responses. Thus the process of ex- 
tinction is explained in terms of effect of need reduction (186). 

mediate gradient of reinforcement. It is the failure to react to remote
punishing consequences which is the basis for the persistence of behavior 
in the face of punishment. This analysis holds well for the behavior of 
Mowrer and Ullman’s rats who, indeed, showed only a very limited 
capacity to learn in terms of temporally remote consequences. But 
what about non-integrative learning in humans capable of reactions to 
symbols? In face of this difficulty the authors unfortunately fall back 
on a question-begging argument: “The fact that certain habits or 
‘traits of character’ may persist in the face of consistent punishment, 
raises a more difficult problem. One possibility is that punishment, 
which is observed, is offset by self-administered reward, which is not 
observed” (192, p. 85). But on this supposition the law of effect is in-
capable of disproof since whatever learning occurs is ipso facto con-
sidered the result of reward administered either visibly by others or
invisibly (and often unobservably) by the self. Mowrer and Ullman
probably sense this difficulty since they admit in the same paper that
“there may be more to the problem than this and that the ‘strength of
the total ego’ seems capable in ways which have not yet been clearly 
analyzed, of being mobilized in support of any single part (habit) for 
which the going is particularly hard” (p. 85). 

In the symposium already referred to, Mowrer (185) has further 
clarified his view of the relation between the law of effect and “ego-
processes.” He does not believe that the type of learning which is
described as “ego-involved” poses problems which the law of effect
cannot answer. For the term “ego-involvement” he would substitute
the term “interest”; for “interest” he would in turn substitute “emo- 
tional arousal”; and emotional arousal (e.g., fear) is a derived drive 
whose satisfaction is not in principle different from the reduction of 
the so-called primary, or biological drives. By such successive reductions 
Mowrer achieves the unification of all motives which may be effective 
in learning and staunchly defends the law of effect as the sole true law 
of learning. “Learning occurs when and only when a drive is reduced,
a problem solved, a satisfaction derived, but . . . this satisfaction may
stem from the reduction of either a primary or secondary drive” (185, 
p. 332). 24 It is the mediation of effect through symbolic processes, 

24 In his most recent statement Professor Mowrer (185a) has reversed his position 
to some extent. He now admits that the law of effect has failed as a universal monistic 
principle of learning and suggests that it may be necessary to assume two basic learning 
processes: (1) the processes whereby solutions to problems, i.e., ordinary habits, are 
acquired and (2) the processes whereby emotional learning or “conditioning” takes place.
Habit formation or problem solving is mediated by the law of effect whereas conditioning 
or emotional learning is conceptualized as a process of “association” independent of 

especially self-administered rewards, which distinguishes ego-involved
learning from learning which is reinforced by the reduction of physio- 
logical needs. 

An attempt rather similar to Mowrer’s, though less orthodox in 
formulation, to bridge the gap between ego-psychology and the law 
of effect was made by Rice (221). Rice admits that the law cannot be 
held if it means that success or satisfaction leads to (a) a perpetuation 
of specific responses and/or (b) the repeated choice of the same specific 
goal object. When normal adults repeat one of these two elements, 
it is usually with a variation in the other, or both response and goal 
are varied. But even though either specific response sequence or goal 
or both may be varied, it may still be true that something about the 
activity is repeated, such as the “interest” that is involved. Rice sug- 
gests a reformulation of the law of effect. Success or satisfaction does 
not stamp in a specific stimulus-response connection but it does con- 
firm the learner's interest in the general range of problems in which he 
has been successful. To the old question, what is it that success “stamps
in,” Rice then offers the answer that it is “interest.” Interests them-
selves, even the processes to which the term ego refers, are acquired and
perpetuated through the operation of this modified law of effect. Like 
Mowrer, Rice believes that symbolic processes (self-administered re- 
wards) play an important role in the mediation of the effect. “That 
core of the act which constitutes the 'interest' is the feature of it which 
is most likely to be symbolized and repeatedly confirmed through ap-
proval of its symbol” (221, p. 316).

The papers of Mowrer and Rice make it clear that it is logically 
possible to reformulate the law of effect and to describe its operation in 
such a way as to make it universally applicable to animal and “ego- 
involved” learning alike. But as Allport’s reply (4) to these papers 
shows, the psychologist who is primarily concerned with the analysis 
of ego-processes cannot be satisfied by such reductions and reformula- 
tions. Allport rejects the reduction of ego-processes to emotional arousal 
(secondary drives). Only certain emotional states are ego-involved and 
the two states should not be considered identical. Indeed, it may be 
the lack of emotional arousal which characterizes the smooth function- 
ing of some ego-processes. Any law of learning based on “reduction of 

pleasure or tension reduction. Mowrer also suggests that effect learning is mediated by
the central nervous system whereas emotional learning is carried on by the autonomic 
nervous system. This reformulation considerably curtails the sphere of operation allotted 
to the law of effect. It remains to be seen whether this particular dichotomization will 
fare any better than previous attempts at mutually exclusive classifications of the 
learning process. 

emotional tension” cannot indiscriminately be applied to learning that
proceeds from ego-interests. Allport staunchly maintains that the role
of ego-processes is “irreducible.” There are other basic phenomena of
learning for which effect theory cannot account satisfactorily. The
favorable influence of motor activity and active participation resists 
reduction to pleasure or satisfaction. Above all, the ubiquitous direc-
tive action of interests must be taken into account in any comprehensive
theory of learning; “. . . learning proceeds because it is relevant to an
interest system: it adds to knowledge, it differentiates items within the
system, it broadens the range of equivalent stimuli .... Pleasure at- 
tending a single response, or even concatenations of response, is not 
decisive” (4, p. 346). The capacity to relate environmental events to 
one’s interest system, to discriminate between relevant and irrelevant 
means, far transcends the operation of effect, however broadly con- 
ceived. 

In the light of such considerations Allport considers the law of effect 
as a secondary principle in learning. He believes that satisfaction plays 
a decisive role only in those organisms and those situations in which 
ego-processes are not involved. In less mechanical forms of learning 
satisfaction loses importance and can be maintained only as a question- 
begging concept. The symbols which are invoked as mediators of self- 
reward are “vague molecular constructs that taper off into a kind of 
aimless triviality so far as explanatory power is concerned” (4, p. 344). 
At best, satisfaction or success serves as an indicator to the learner of 
how well he is adjusting to his problem. But though satisfaction helps 
the learner perceive the situation he uses this indication in a variable 
manner according to the pattern of interests that comprise his ego- 
structure. 

Conclusions 

In the course of this survey we have touched on learning behavior 
ranging from conditioned salivation to the development of interest 
systems in mature adults. To all these types of learning the law of 
effect has been applied by some, only to be rejected by others. In spite 
of an ever-growing volume of experimental and theoretical papers, 
agreement still seems to be a long way off. The golden jubilee of the 
law of effect will be celebrated in the not-too-distant future, but some 
of the basic issues for learning theory which the law of effect had hoped 
to answer still remain:

1. What is the agent responsible for reinforcement? Is it association by con- 
tiguity; an OK reaction; the reduction of tension, primary or derived; the 
satisfaction of an interest; the confirmation of a cognitive expectation; the 
approval of the ego? 

2. What is it that is reinforced? Is it a neural bond; an S-R connection; a
perceptual organization; an interest system; the ego?

3. What is the basic mechanism of reinforcement? Is it the lowering of
synaptic resistances; cortical irradiation; or are the physiological processes 
which will find ultimate acceptance still unnamed? 

Perhaps there is no one right answer among these alternatives. 
Probably more than one of these principles and additional ones that 
remain to be discovered will be needed in an integrated theory of learn- 
ing. It is safe to say that at the present state of our knowledge the law 
of effect as a monistic principle explaining all learning has not been sub- 
stantiated. As one of the behavioral facts of learning, it cannot be 
gainsaid. 